Nathan Cutler [Fri, 26 Feb 2016 17:30:49 +0000 (18:30 +0100)]
packaging: lsb_release build and runtime dependency
The lsb_release executable is being run in multiple places, not least in
src/common/util.cc, which calls it via shell in the collect_sys_info() code
path.
This patch addresses this issue on SUSE- and Debian-derivatives, as well
as reinstating the dependency for RHEL/Fedora after it was dropped in 15600572265bed397fbd80bdd2b7d83a0e9bd918.
Conflicts:
ceph.spec.in
The jewel specfile has diverged considerably from hammer:
systemd, package split, etc. This is more of a hand backport
than a cherry-pick.
Conflicts:
src/test/crypto.cc : complements the incorrect cherry-pick df3f971eafda9c54881c13dcf47f996f18e17028 see
http://tracker.ceph.com/issues/14863 for more information
This Op would change the temporary osd_primary_affinity, but osd_primary_affinity
is declared as a ceph::shared_ptr, so the change would reach the osdmap too. If
the proposal from this round of encode_pending fails, encode_pending may be called
again, but the osdmap was already changed in the previous round, so the
pending_inc would be wrong.
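The aliasing problem described above can be illustrated with a minimal Python sketch. The class and function names are hypothetical stand-ins, not the Ceph types: a shared list reference plays the role of the ceph::shared_ptr, and the deep copy is just one possible way to isolate the temporary, not necessarily what the patch does.

```python
import copy


class OsdMapSketch:
    """Hypothetical stand-in for an OSD map holding a shared affinity table."""

    def __init__(self, affinity):
        # Shared reference, playing the role of the shared_ptr member.
        self.primary_affinity = affinity


def shallow_tmp(osdmap):
    # Copies the map object, but the affinity table is still shared.
    return copy.copy(osdmap)


def deep_tmp(osdmap):
    # Copies the affinity table too, so edits stay local to the temporary.
    return copy.deepcopy(osdmap)


committed = OsdMapSketch([1.0, 1.0, 1.0])
tmp = shallow_tmp(committed)
tmp.primary_affinity[0] = 0.5            # leaks into the committed map
leaked = committed.primary_affinity[0]

committed2 = OsdMapSketch([1.0, 1.0, 1.0])
tmp2 = deep_tmp(committed2)
tmp2.primary_affinity[0] = 0.5           # stays in the temporary
isolated = committed2.primary_affinity[0]
```

With the shallow copy, a retry after a failed proposal would start from an already modified map; with the isolated copy it would not.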
[backport] rgw: Make RGW_MAX_PUT_SIZE configurable
The 5GB limit on a single upload operation is part of the S3 spec.
However, some private setups may have special requirements
for this limit. It is more convenient to have a configurable value.
Closes: http://tracker.ceph.com/issues/14569
(cherry picked from commit df97f28)
Fixes: #14637
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
(cherry picked from commit ec162f068b40f594c321df5caa9fe2541551b89e)
Cherry-pick to hammer includes nroff source change (in master the
nroff sources are no longer present in Git.)
Greg Farnum [Wed, 13 Jan 2016 21:17:53 +0000 (13:17 -0800)]
fsx: checkout old version until it compiles properly on miras
I sent a patch to xfstests upstream at
http://article.gmane.org/gmane.comp.file-systems.fstests/1665, but
until that's fixed we need a version that works in our test lab.
Jason Dillaman [Fri, 14 Aug 2015 17:28:13 +0000 (13:28 -0400)]
WorkQueue: PointerWQ drain no longer waits for other queues
If another (independent) queue was processing, drain could
block waiting. Instead, allow drain to exit quickly if
no items are being processed and the queue is empty for
the current WQ.
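A minimal sketch of the drain condition described above, assuming several queues share one pool lock. The class and method names are hypothetical and only model the idea: drain looks at this queue's own pending items and in-flight count, not at what other queues are doing.

```python
import threading
from collections import deque


class PointerWQSketch:
    """Hypothetical per-queue state sharing one pool-wide lock."""

    def __init__(self, pool_lock):
        self._cond = threading.Condition(pool_lock)
        self._items = deque()
        self._processing = 0  # items of THIS queue currently being handled

    def queue(self, item):
        with self._cond:
            self._items.append(item)

    def begin(self):
        # A worker takes an item and marks it as in flight.
        with self._cond:
            item = self._items.popleft()
            self._processing += 1
            return item

    def finish(self):
        with self._cond:
            self._processing -= 1
            self._cond.notify_all()

    def drain(self, timeout=None):
        # Returns True once this queue is empty and idle, regardless of
        # whether other queues on the same pool are still busy.
        with self._cond:
            return self._cond.wait_for(
                lambda: not self._items and self._processing == 0, timeout)


pool_lock = threading.Lock()
wq_a = PointerWQSketch(pool_lock)
wq_b = PointerWQSketch(pool_lock)
wq_b.queue("request")
wq_b.begin()                      # wq_b is now busy
a_drained = wq_a.drain(timeout=0.1)
b_drained = wq_b.drain(timeout=0.05)
```

Here `wq_a.drain()` returns at once even though `wq_b` still has an item in flight.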
Loic Dachary [Fri, 18 Dec 2015 16:03:21 +0000 (17:03 +0100)]
ceph-disk: use blkid instead of sgdisk -i
sgdisk -i 1 /dev/vdb opens /dev/vdb in write mode which indirectly
triggers a BLKRRPART ioctl from udev (starting version 214 and up) when
the device is closed (see below for the udev release note). The
implementation of this ioctl by the kernel (even old kernels) removes
all partitions and adds them again (similar to what partprobe does
explicitly).
The side effects of partitions disappearing while ceph-disk is running
are devastating.
sgdisk is replaced by blkid which only opens the device in read mode and
will not trigger this unexpected behavior.
The problem does not show on Ubuntu 14.04 because it is running udev <
214 but shows on CentOS 7 which is running udev > 214.
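As a rough illustration of the read-only approach, the sketch below parses KEY=value output of the kind that "blkid -o udev -p <device>" prints. The helper names are hypothetical, the sample text is made up for the example rather than captured from a real device, and the actual ceph-disk code may differ.

```python
import subprocess


def parse_blkid_udev(output):
    """Turn KEY=value lines into a dict, ignoring anything else."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def partition_uuid(dev):
    # blkid only opens the device for reading, so closing it should not
    # trigger the partition table rescan described above.
    out = subprocess.check_output(["blkid", "-o", "udev", "-p", dev])
    return parse_blkid_udev(out.decode("utf-8")).get("ID_PART_ENTRY_UUID")


# Illustrative, made-up output (placeholder values only).
sample = (
    "ID_PART_ENTRY_SCHEME=gpt\n"
    "ID_PART_ENTRY_UUID=11111111-2222-3333-4444-555555555555\n"
    "ID_PART_ENTRY_NUMBER=1\n"
)
parsed = parse_blkid_udev(sample)
```

Only `parse_blkid_udev` is exercised here; `partition_uuid` needs a real block device and sufficient privileges.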
git clone git://anonscm.debian.org/pkg-systemd/systemd.git
systemd/NEWS:
CHANGES WITH 214:
* As an experimental feature, udev now tries to lock the
disk device node (flock(LOCK_SH|LOCK_NB)) while it
executes events for the disk or any of its partitions.
Applications like partitioning programs can lock the
disk device node (flock(LOCK_EX)) and claim temporary
device ownership that way; udev will entirely skip all event
handling for this disk and its partitions. If the disk
was opened for writing, the close will trigger a partition
table rescan in udev's "watch" facility, and if needed
synthesize "change" events for the disk and all its partitions.
This is now unconditionally enabled, and if it turns out to
cause major problems, we might turn it on only for specific
devices, or might need to disable it entirely. Device Mapper
devices are excluded from this logic.
Conflicts:
src/ceph-disk: keep get_partition_type as it is because
some hammer users may rely on the fact that it is able
to fallback to sgdisk if blkid is old. Chances are an
old blkid also means an old udev that does not have the
problem this fix is addressing. The get_partition_uuid
is modified to try blkid first, with the same rationale.
Zhi Zhang [Mon, 1 Feb 2016 03:03:30 +0000 (11:03 +0800)]
[ceph-fuse] fix ceph-fuse writing to stale log file after log rotation
This fix should be applied to the hammer branch. It can't be applied directly to the master branch, because logrotate.conf has changed on master: ceph-osd, ceph-mon, etc. are controlled by systemd with user/group 'ceph' by default, while ceph-fuse might be started with root privileges by external users.
Kefu Chai [Thu, 28 Jan 2016 10:09:53 +0000 (02:09 -0800)]
mon: compact full epochs also
Compacting the range ${prefix}.${start}..${prefix}.${end} does not
necessarily compact the range ${prefix}."full_"${start}..
${prefix}."full_"${end}. So when more and more epochs get trimmed
without a full range compaction, the size of the monitor store could
become very large.
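The two key ranges mentioned above can be sketched as follows. The function name is hypothetical and the key layout is simplified from the commit message; it only shows that trimming epochs start..end implies two separate ranges to compact.

```python
def compaction_ranges(prefix, start, end):
    """Return (prefix, first_key, last_key) tuples covering trimmed epochs."""
    return [
        # Incremental epochs: ${prefix}.${start} .. ${prefix}.${end}
        (prefix, str(start), str(end)),
        # Full epochs: ${prefix}.full_${start} .. ${prefix}.full_${end}
        (prefix, "full_%d" % start, "full_%d" % end),
    ]


ranges = compaction_ranges("osdmap", 10, 20)
```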
ReplicatedPG::prepare_transaction(): check if the pool is full before
updating the cached ObjectContext to avoid the discrepancy between
the cached and the actual object size (and other metadata).
While at it, improve the check itself: consider the cluster full flag,
not just the pool full flag, and consider object count changes,
not just bytes.
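A minimal sketch of the ordering and of the broadened check, with hypothetical names. It assumes that writes which do not grow usage are still allowed when full; the real ReplicatedPG code may decide differently.

```python
def rejects_when_full(cluster_full, pool_full, delta_bytes, delta_objects):
    """True if the write must be refused because it would grow usage."""
    if not (cluster_full or pool_full):
        return False
    # Consider both byte growth and object count growth.
    return delta_bytes > 0 or delta_objects > 0


class CachedObjectSketch:
    """Hypothetical stand-in for a cached object context."""

    def __init__(self, size):
        self.size = size


def write(obj, new_size, cluster_full=False, pool_full=False,
          creates_object=False):
    delta_bytes = new_size - obj.size
    delta_objects = 1 if creates_object else 0
    # Check first, so a rejected write never touches the cached metadata.
    if rejects_when_full(cluster_full, pool_full, delta_bytes, delta_objects):
        return False
    obj.size = new_size
    return True


obj = CachedObjectSketch(100)
grew = write(obj, 200, pool_full=True)
size_after_reject = obj.size
shrank = write(obj, 50, cluster_full=True)
```

Because the check runs before the update, the cached size stays at 100 after the rejected write.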
Conflicts:
src/osd/ReplicatedPG.cc
code section was moved to ReplicatedPG::maybe_promote
in master.
Signed-off-by: Robert LeBlanc <robert.leblanc@endurance.com>
Sage Weil [Wed, 25 Nov 2015 19:39:08 +0000 (14:39 -0500)]
osd/ReplicatedPG: fix promotion recency logic
Recency is defined as how many of the last N hitsets an object
must appear in in order to be promoted. The previous logic did
nothing of the sort... it checked for the object in any one of
the last N hitsets, which led to way too many promotions and killed
any chance of the cache performing properly.
While we are here, we can simplify the code to drop the max_in_*
fields (no longer necessary).
Note that we may still want a notion of 'temperature' that does
tolerate the object missing in one of the recent hitsets.. but
that would be different than recency, and should probably be
modeled after the eviction temperature model.
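The difference between the old and new behaviour can be sketched like this. It follows one reading of the definition above (the object must be present in each of the most recent `recency` hitsets); the function names are hypothetical and hitsets are modelled as plain Python sets ordered newest first.

```python
def in_any_recent(obj, hitsets, recency):
    """Old behaviour: a hit in any one of the recent hitsets was enough."""
    return any(obj in hs for hs in hitsets[:recency])


def in_all_recent(obj, hitsets, recency):
    """Fixed behaviour: the object must appear in every recent hitset."""
    recent = hitsets[:recency]
    return len(recent) >= recency and all(obj in hs for hs in recent)


# Newest hitset first.
hitsets = [{"a"}, {"a", "b"}, {"b"}]
old_b = in_any_recent("b", hitsets, 2)
new_b = in_all_recent("b", hitsets, 2)
new_a = in_all_recent("a", hitsets, 2)
```

Object "b" would have been promoted under the old check but not under the stricter one, while "a" qualifies under both.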
Backport: infernalis, hammer
Reported-by: Nick Fisk <nick@fisk.me.uk>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 180c8743addc5ae2f1db9c58cd2996ca6e7ac18b)
Conflicts:
src/osd/ReplicatedPG.cc
code section was moved to ReplicatedPG::maybe_promote
in master.
Signed-off-by: Robert LeBlanc <robert.leblanc@endurance.com>
man/rados.8: also added the rendered man.8 man page, as we don't
put the generated man pages in master anymore, but
they are still in hammer's source repo.
Douglas Fuller [Fri, 22 Jan 2016 19:18:40 +0000 (11:18 -0800)]
rbd: remove canceled tasks from timer thread
When canceling scheduled tasks using the timer thread, TaskFinisher::cancel
does not call SafeTimer::cancel_event, so events fire anyway. Add this call.
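A simplified model of the fix, using hypothetical stand-in classes rather than the real SafeTimer and TaskFinisher: cancelling a task must also remove its event from the timer, otherwise the callback still fires.

```python
class SafeTimerSketch:
    """Deterministic stand-in for a timer: events fire when asked to."""

    def __init__(self):
        self._events = {}
        self._next_id = 0

    def add_event(self, callback):
        event_id = self._next_id
        self._next_id += 1
        self._events[event_id] = callback
        return event_id

    def cancel_event(self, event_id):
        return self._events.pop(event_id, None) is not None

    def fire_all(self):
        events, self._events = self._events, {}
        for callback in events.values():
            callback()


class TaskFinisherSketch:
    def __init__(self, timer):
        self._timer = timer
        self._tasks = {}  # task -> timer event id

    def add_event_after(self, task, callback):
        self._tasks[task] = self._timer.add_event(callback)

    def cancel(self, task):
        event_id = self._tasks.pop(task, None)
        if event_id is not None:
            # The step the commit adds: also cancel the timer event.
            self._timer.cancel_event(event_id)


fired = []
timer = SafeTimerSketch()
finisher = TaskFinisherSketch(timer)
finisher.add_event_after("t1", lambda: fired.append("t1"))
finisher.add_event_after("t2", lambda: fired.append("t2"))
finisher.cancel("t1")
timer.fire_all()
```

Without the `cancel_event` call in `cancel`, both callbacks would run.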
This command repeatedly adds the latest pgmap to the monstore in order
to inflate it. The command helps with testing monstore-related
performance issues of the monitor.
Kefu Chai [Fri, 19 Jun 2015 14:57:57 +0000 (22:57 +0800)]
tools/ceph-monstore-tools: add rewrite command
"rewrite" command will
- add a new osdmap version to update current osdmap held by OSDMonitor
- add a new paxos version, as a proposal it will
* rewrite all osdmap epochs from specified epoch to the last_committed
one with the specified crush map.
* add the new osdmap which is added just now
so the leader monitor can trigger a recovery process to apply the transaction
to all monitors in quorum, and hence bring them back to normal after being
injected with a faulty crushmap.
Ken Dreyer [Mon, 18 Jan 2016 15:24:46 +0000 (08:24 -0700)]
osd: disable filestore_xfs_extsize by default
This option involves a tradeoff: When disabled, fragmentation is worse,
but large sequential writes are faster. When enabled, large sequential
writes are slower, but fragmentation is reduced.
Loic Dachary [Fri, 29 Jan 2016 03:36:05 +0000 (10:36 +0700)]
Merge pull request #7316 from ceph/wip-deb-lttng-hammer
deb: strip tracepoint libraries from Wheezy/Precise builds
All other "modern" Debian-based OSes have a functional LTTng-UST. Since only hammer needs to build on these older distros, this fix only affects the deb build process for those two releases (since autoconf detects that LTTng is broken).