xie xingguo [Sat, 19 Jan 2019 09:19:10 +0000 (17:19 +0800)]
crush: fix upmap overkill
It appears that OSDMap::maybe_remove_pg_upmaps's sanity checks
are overzealous. With some customized crush rules it is possible
for osdmaptool to generate valid upmaps, but maybe_remove_pg_upmaps
will cancel them.
xie xingguo [Mon, 18 Feb 2019 07:40:22 +0000 (15:40 +0800)]
osd/OSDMap: use std::vector::reserve to reduce memory reallocation
In C++, vectors are dynamic arrays backed by contiguous blocks of
memory. When the memory allocated for a vector falls short of storing
new elements, a new block is allocated and all existing elements are
copied from the old location to the new one. This reallocation lets
vectors grow when required, but it is a costly operation whose time
complexity is linear in the number of elements.
Try to use std::vector::reserve whenever possible if performance
matters.
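A minimal sketch of the pattern (hypothetical names, not the actual
OSDMap code): when the final size is known up front, reserving the
capacity avoids every intermediate reallocation.
```
#include <vector>

// With reserve(), a single allocation happens up front, so the
// push_back() calls below never trigger a reallocation-and-copy.
std::vector<int> collect_osds(int count) {
  std::vector<int> osds;
  osds.reserve(count);
  for (int i = 0; i < count; ++i)
    osds.push_back(i);
  return osds;
}
```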
xie xingguo [Sat, 26 Jan 2019 10:03:15 +0000 (18:03 +0800)]
osd/OSDMap: more improvements to upmap
- add the ability to append a 2nd, 3rd, etc. pair to an existing upmap
when possible, rather than just continuing to the next PG (see the
sketch after this list)
- handle the underfull case: we can rm-pg-upmap-items if there exist
any upmaps that remapped a PG off an underfull OSD
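For illustration, a conceptual sketch of the first point, using a
simplified stand-in for pg_upmap_items (a map from PG to a list of
(from, to) OSD pairs); this is not the actual OSDMap code:
```
#include <map>
#include <utility>
#include <vector>

using pg_id = unsigned;

// Simplified stand-in: each PG maps to a list of (from_osd, to_osd)
// remap pairs.
std::map<pg_id, std::vector<std::pair<int, int>>> pg_upmap_items;

// Append another remap pair to a PG's existing entry instead of
// skipping the PG because it already has one.
void append_upmap_pair(pg_id pg, int from_osd, int to_osd) {
  pg_upmap_items[pg].emplace_back(from_osd, to_osd);
}
```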
xie xingguo [Fri, 18 Jan 2019 10:14:52 +0000 (18:14 +0800)]
osd/OSDMap: be more aggressive when trying to balance
Previously we required an absolute deviation >= 1 before an osd
could enter the overfull or underfull set, which could leave some
osds stuck severely underfull forever.
This patch tries to get us out of those corner cases in two ways
(see the sketch after this list):
- a standard deviation is introduced to evaluate the efficiency
of balancing, so the distribution of pgs always converges toward
the perfect state (standard deviation == 0).
- the overfull and underfull sets are populated more aggressively
by gradually letting the absolute deviation threshold converge
toward 0 instead of 1.
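A minimal sketch of the convergence metric, assuming per-OSD PG counts
as input (illustrative, not the actual OSDMap code):
```
#include <cmath>
#include <vector>

// Standard deviation of per-OSD PG counts; 0 means a perfectly
// balanced distribution, so the balancer can keep iterating while
// this value still improves.
double pg_count_stddev(const std::vector<int>& pgs_per_osd) {
  if (pgs_per_osd.empty())
    return 0;
  double mean = 0;
  for (int n : pgs_per_osd)
    mean += n;
  mean /= pgs_per_osd.size();
  double var = 0;
  for (int n : pgs_per_osd)
    var += (n - mean) * (n - mean);
  return std::sqrt(var / pgs_per_osd.size());
}
```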
It turns out the balancer module works even better now after
applying this patch. E.g.:
```
OSD=5 MON=1 MGR=1 MDS=0 ../src/vstart.sh -x -l -b -n -d
bin/ceph osd set-require-min-compat-client luminous
bin/ceph balancer mode upmap
bin/ceph balancer on
bin/ceph osd pool create rbd 117
# wait until automatic balancing is done
bin/ceph osd pool create aaa 133
```
__before__:
```
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 151 up
1 hdd 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 153 up
2 hdd 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 147 up
3 hdd 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 149 up
4 hdd 0.00980 1.00000 10 GiB 1.1 GiB 3.9 MiB 0 B 1 GiB 9.0 GiB 10.60 1.00 150 up
TOTAL 50 GiB 5.3 GiB 19 MiB 0 B 5 GiB 45 GiB 10.60
```
__after__:
```
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.00980 1.00000 10 GiB 1.1 GiB 6.2 MiB 0 B 1 GiB 9.0 GiB 10.62 1.00 150 up
1 hdd 0.00980 1.00000 10 GiB 1.1 GiB 6.2 MiB 0 B 1 GiB 9.0 GiB 10.62 1.00 151 up
2 hdd 0.00980 1.00000 10 GiB 1.1 GiB 6.2 MiB 0 B 1 GiB 9.0 GiB 10.62 1.00 149 up
3 hdd 0.00980 1.00000 10 GiB 1.1 GiB 6.2 MiB 0 B 1 GiB 9.0 GiB 10.62 1.00 150 up
4 hdd 0.00980 1.00000 10 GiB 1.1 GiB 6.2 MiB 0 B 1 GiB 9.0 GiB 10.62 1.00 150 up
TOTAL 50 GiB 5.3 GiB 31 MiB 0 B 5 GiB 45 GiB 10.62
```
xie xingguo [Sat, 12 Jan 2019 07:01:54 +0000 (15:01 +0800)]
osd/OSDMap: potential access violation fix
It seems we continue to access the iterator after it has been
invalidated by the __erase__ method.
The new approach is also more efficient, considering there could be
some extreme ec-pool (e.g., 8 + 2) consumers.
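A minimal sketch of the safe pattern (generic C++, not the actual
OSDMap code): erase() invalidates the erased iterator, so the loop
must continue from the iterator erase() returns.
```
#include <map>

// Erase entries while iterating without touching an invalidated
// iterator: map::erase() returns the next valid position.
void prune_zeroes(std::map<int, int>& m) {
  for (auto it = m.begin(); it != m.end(); ) {
    if (it->second == 0)
      it = m.erase(it);  // safe: advances past the erased element
    else
      ++it;
  }
}
```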
huangjun [Fri, 24 Aug 2018 14:47:02 +0000 (22:47 +0800)]
osd/OSDMap: don't map all pgs each time in calc_pg_upmaps
We have a cluster pool with 32768 pgs and 400 osds; doing upmap with
'--upmap-max 32768 --upmap-deviation 0.01' takes 600 seconds, which is
pretty slow.
After adding some debug code, we found most of the time is spent in
pg_to_up_acting_osds: one call takes about 12us on average, so mapping
the whole pool's pgs costs about 500ms per pass. We ultimately have
1429 pgs that need upmaps, so the total comes to about 600 seconds.
With this patch, it takes only 5 seconds to get the job done.
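A conceptual sketch of the idea, assuming the fix avoids recomputing
unchanged mappings (names are illustrative, not the calc_pg_upmaps
code):
```
#include <map>
#include <vector>

using pg_id = unsigned;

// Stand-in for the expensive pg_to_up_acting_osds() call.
std::vector<int> compute_up_osds(pg_id pg) {
  return {int(pg % 5), int((pg + 1) % 5)};
}

std::map<pg_id, std::vector<int>> up_cache;

// Compute each PG's mapping at most once per balancing run instead
// of remapping all pgs on every pass.
const std::vector<int>& up_osds_for(pg_id pg) {
  auto it = up_cache.find(pg);
  if (it == up_cache.end())
    it = up_cache.emplace(pg, compute_up_osds(pg)).first;
  return it->second;
}
```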
Noah Watkins [Thu, 17 Jan 2019 19:16:44 +0000 (11:16 -0800)]
cli: dump osd-fsid as part of osd find <id>
Dumps the osd-fsid uuid as part of the `osd find <id>` command.
Currently this uuid is only available via `osd dump`, but
ceph-ansible has a use case for interrogating a single osd without
needing the entire osdmap dump.
Neha Ojha [Mon, 7 Jan 2019 23:26:27 +0000 (15:26 -0800)]
mon/OSDMonitor.cc: make a note about reusing jewel feature bit
For OSD_PGLOG_HARDLIMIT, we have reused a jewel feature bit that was
retired in luminous. Therefore, we need to check that the release
version is >= CEPH_RELEASE_LUMINOUS before using it.
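A minimal sketch of the guard this note describes, with placeholder
constants rather than the real Ceph definitions:
```
#include <cstdint>

constexpr int kReleaseLuminous = 12;       // stands in for CEPH_RELEASE_LUMINOUS
constexpr uint64_t kPglogHardlimitBit = 1ULL << 21;  // illustrative bit position

// The bit alone is ambiguous because jewel used it for something
// else, so the peer's release must be checked before trusting it.
bool pglog_hardlimit_usable(int release, uint64_t features) {
  return release >= kReleaseLuminous &&
         (features & kPglogHardlimitBit) != 0;
}
```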
Conflicts:
src/include/rados.h
src/mon/MonCommands.h
src/mon/OSDMonitor.cc
src/osd/OSDMap.cc
Luminous does not have CEPH_OSDMAP_NOSNAPTRIM flag.
In nautilus, CEPH_OSDMAP_PGLOG_HARDLIMIT is set by default,
which is not the case in luminous.
In https://github.com/ceph/ceph/pull/21580 I set a trap to catch some
weird and random segfaults, and in a recent QA run I observed it was
successfully triggered by one of the test cases, see:
The root cause is that there might be holes in the log versions; thus
the approx_size() method will (almost) always overestimate the actual
number of log entries.
As a result, we are at risk of overtrimming log entries.
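To make the overestimate concrete, a small illustration (generic C++,
assuming a size estimate derived from the version range):
```
#include <vector>

int main() {
  // Log entry versions with a hole at 12-13.
  std::vector<unsigned> versions = {10, 11, 14, 15};
  unsigned range_estimate = versions.back() - versions.front() + 1;  // 6
  unsigned actual = versions.size();                                 // 4
  // A trim budget computed from range_estimate assumes more entries
  // exist than actually do, so trimming can cut deeper than intended.
  return range_estimate > actual ? 0 : 1;
}
```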
https://github.com/ceph/ceph/pull/18338 reveals a probably easier way
to fix the above problem, but unfortunately it can also cause a big
performance regression, hence this PR.
For the auto-repair (EIO caused) case, we will not reinitialize
**complete_to** (because last_complete is equal to last_update!),
and hence there is a chance that **complete_to** already points
to **log.end()** before we call recover_got.
We can simply drop it here, as we already log the **complete_to**
iterator change in a more compatible way a few lines below.
src/osd/PG.cc: remove redundant call to trim_log()
This change is motivated by the failure tracked in
https://tracker.ceph.com/issues/25198. The failure highlights a case
where a call to trim_log() after the PG has recovered races with the
previous op on a replica OSD. Since the previous operation has not
completed, the last_complete value for that OSD is not valid when we
try to trim the log. It is also worth noting that the race arises
because MOSDPGTrim goes through the strict queue as a peering message,
whereas regular ops go through the non-strict queue.
During the investigation of this bug, we noticed that with
https://tracker.ceph.com/issues/23979 we allow pg log trimming to
happen on the primary and replicas whenever we cross the upper bound
of the pg log. This also ensures that pg log trimming happens while
processing any new op.
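A conceptual sketch of that behavior (not the actual PGLog code),
assuming a hard upper bound on the log length:
```
#include <cstddef>
#include <deque>

// With a hard cap on the pg log, every op that pushes the log past
// the bound triggers a trim, so a separate trim_log() pass after
// recovery adds nothing.
struct PgLogSketch {
  std::deque<unsigned> entries;    // entry versions, oldest first
  std::size_t max_entries = 3000;  // e.g. osd_max_pg_log_entries

  void add(unsigned version) {
    entries.push_back(version);
    while (entries.size() > max_entries)
      entries.pop_front();         // trim while processing the op
  }
};
```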
Therefore, the function trim_log(), which earlier served the purpose
of trimming logs on the primary and replicas just before the PG went
into the Recovered state, is no longer required. It acted as a last
line of defense to trim logs once we no longer needed them, but the
call is redundant now because we limit the pg log length at all times.
Conflicts:
src/osd/PGLog.cc: Now it is possible to have complete_to version
less than or equal to trim version, because the pg log length upper
limit is a hard limit, and trim can proceed even when there is
pending recovery/backfill. So do not complain when this happens.
Remove async recovery components: The async recovery feature is not present
in luminous. We do not need commit 22d17fb5aad6ab9d7525d9492c0e96a36d02879e,
which adds a flag to remember async recovery. We have also removed async
recovery requirements from this commit and modified the commit message to
only reflect backfill.
Conflicts:
src/osd/PrimaryLogPG.cc: min_last_complete_ondisk and
pg_log.get_can_rollback_to() are no longer the limit of the pg log.
Make the head of the pg log the new limit for pg log trimming.
Volker Theile [Thu, 29 Nov 2018 12:48:30 +0000 (13:48 +0100)]
ceph-volume: Adapt code to support Python3
- raw_input() has been renamed to input() in Python 3
- Changed the signature of prompt_bool. Variables that shadow
built-ins must be named like xxx_, not _xxx
https://github.com/ceph/ceph/pull/25824 adds 'slow request' messages
to OSD logs. To deal with this, whitelist 'slow request' instead of
'slow requests'. This PR is specific to luminous because later
versions already whitelist it correctly.
IvanGuan [Fri, 4 Jan 2019 04:22:27 +0000 (12:22 +0800)]
client: fix fuse client hang because its pipe to mds is not ok
If a fuse client's session has been killed by the mds, and the mds
daemon restarts or a hot-standby switch happens right away, the client
may not receive any message from the monitor (due to the network or
whatever other reason) until the mds becomes active again. As a result,
the client never calls closed_mds_session, so the session remains in
STATE_OPEN even though the client cannot send any message to the mds
because its pipe is not ok. We should therefore close the stale session
so that it can be reopened.
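A conceptual sketch of the fix (illustrative names, not the actual
Client code): a session stuck OPEN on a dead pipe gets closed so a
fresh one can be established once the mds is active again.
```
enum class SessionState { OPEN, CLOSED };

struct MdsSession {
  SessionState state = SessionState::OPEN;
  bool pipe_ok = false;  // transport to the mds is broken
};

// Close a session the mds has already killed so the client can
// reopen it against the newly active mds.
void close_if_stale(MdsSession& s) {
  if (s.state == SessionState::OPEN && !s.pipe_ok)
    s.state = SessionState::CLOSED;
}
```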
As the largish g_conf() change from master isn't in mimic yet, use the
g_conf global structure. Also make rgw_op use the value from the
req_info ceph context, as we do for all requests.