Robin H. Johnson [Thu, 26 May 2016 22:41:20 +0000 (15:41 -0700)]
rgw: Fallback to Host header for bucket name.
RGW should fall back to using the Host header as the bucket name, if valid and
possible, even when it is NOT a suffix match against the DNS names or a match
against the CNAME rule.
This mirrors AWS S3 behavior for these cases (AWS S3 does not do any DNS
lookups regardless).
Backport: jewel
Fixes: http://tracker.ceph.com/issues/15975
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
(cherry picked from commit 46aae19eeb91bf3ac78a94c9d4812a788d9042a8)
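The fallback order described in this commit can be sketched roughly as below. This is an illustrative Python sketch, not RGW's C++ code: the function name `bucket_from_host`, the `dns_names` parameter, and the bucket-name validity regex are assumptions made for the example, and the CNAME rule is left out entirely.

```python
import re

# Assumed validity rule (hypothetical): 3-63 chars of lowercase letters,
# digits, dots and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def bucket_from_host(host, dns_names):
    """Return the bucket name implied by the Host header, or None."""
    host = host.split(":", 1)[0].lower()  # drop any port
    for name in dns_names:
        if host == name:
            return None  # path-style request: the bucket is in the URL path
        suffix = "." + name
        if host.endswith(suffix):
            return host[: -len(suffix)]  # virtual-hosted-style request
    # Fallback: no suffix match, so use the whole Host value if it is valid.
    if _BUCKET_RE.match(host):
        return host
    return None
```

With `dns_names = ["s3.example.com"]`, a request for `photos.s3.example.com` resolves through the suffix match, while `static.example.org` resolves through the fallback.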
Ramana Raja [Thu, 7 Jul 2016 11:45:13 +0000 (17:15 +0530)]
ceph_volume_client: version on-disk metadata
Version on-disk metadata with two attributes:
'compat version', the minimum CephFSVolumeClient
version that can decode the metadata, and
'version', the version that encoded the metadata.
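A minimal sketch of how such a pair of attributes could be used, assuming JSON-encoded metadata. The names `CLIENT_VERSION`, `encode_meta`, and `decode_meta`, and the key spelling `compat_version`, are hypothetical and chosen only for illustration.

```python
import json

CLIENT_VERSION = 2  # hypothetical version of the running client


def encode_meta(data, compat_version=1):
    """Serialize metadata, stamping it with both version attributes."""
    meta = dict(data)
    meta["compat_version"] = compat_version  # oldest client able to decode
    meta["version"] = CLIENT_VERSION         # client that wrote the metadata
    return json.dumps(meta)


def decode_meta(raw, client_version=CLIENT_VERSION):
    """Deserialize metadata, refusing anything this client is too old for."""
    meta = json.loads(raw)
    if meta["compat_version"] > client_version:
        raise RuntimeError(
            "metadata needs client version >= %d" % meta["compat_version"])
    return meta
```

A reader only has to compare its own version against 'compat version'; the 'version' attribute records who wrote the file and is informational in this sketch.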
Ramana Raja [Thu, 23 Jun 2016 10:36:53 +0000 (16:06 +0530)]
ceph_volume_client: modify locking of meta files
File locks are applied on meta files before updating the meta
file contents. These meta files need to be cleaned up at some
point, which could lead to locks being held on unlinked meta
files. Prevent this by checking whether the file had been deleted
after the lock was acquired on it.
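The check described above can be sketched on a local POSIX filesystem as follows. This is an assumption-laden illustration, not the ceph_volume_client code (which works through libcephfs): the helper name `lock_meta_file` is hypothetical, and comparing inode numbers is one possible way to detect that the locked file is no longer the one at the path.

```python
import fcntl
import os


def lock_meta_file(path):
    """Open and exclusively lock path, retrying if it was unlinked meanwhile."""
    while True:
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            on_disk = os.stat(path)
        except FileNotFoundError:
            # The file was deleted after we opened it: our lock is on an
            # unlinked file, so drop it and start over.
            os.close(fd)
            continue
        if on_disk.st_ino == os.fstat(fd).st_ino:
            return fd  # the lock is held on the file currently at path
        # The path now names a different file (deleted and recreated).
        os.close(fd)
```

The caller owns the returned descriptor and releases the lock by closing it.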
Ramana Raja [Tue, 7 Jun 2016 19:12:18 +0000 (00:42 +0530)]
ceph_volume_client: recover from dirty auth and auth meta updates
Check the dirty flag after locking something and call recover() if we are
opening something dirty (racing with another instance of the driver
restarting after failure) -- only required if someone is running multiple
manila-share instances with Ceph loaded.
Ramana Raja [Tue, 21 Jun 2016 06:44:56 +0000 (12:14 +0530)]
ceph_volume_client: modify data layout in meta files
Notable changes to data layout in auth meta and volume meta files:
- In the auth meta files, add a 'dirty' flag to track the status of auth
  updates to a single volume.
- In the volume meta file, make the 'dirty' flag track the status of
  auth updates for a single ID.
Optimize the recovery of partial auth update changes to auth meta,
volume meta, and the Ceph backend, facilitated by changes in the
data layout in the meta files.
Store a two-way mapping between auth IDs and volumes.
- Enables us to record some metadata on auth IDs (which
  OpenStack tenant created it) so that we can avoid exposing
  keys to other tenants who try to use the same Ceph
  auth ID.
- Enables us to expose the list of which auth IDs have access
  to a volume, so that Manila's update_access() can be
  implemented efficiently.
DNM: see TODOs inline.
Fixes: http://tracker.ceph.com/issues/15615
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit d2e9eb55ca6ed5daa094cf323faf143615b9380b)
John Spray [Mon, 7 Mar 2016 13:06:41 +0000 (13:06 +0000)]
pybind: enable integer flags to libcephfs open
The 'rw+' style flags are handy and convenient, but
they don't capture all possibilities. Change to
optionally accept an integer here for advanced users
who want to specify arbitrary combinations of
flags.
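A rough sketch of the idea, assuming a helper that normalizes either form to O_* flags. The function name `to_open_flags` and the exact mapping of the 'r', 'w' and '+' characters are hypothetical; they are not taken from the libcephfs binding.

```python
import os


def to_open_flags(flags):
    """Accept an 'rw+'-style string or a raw integer; return O_* flags."""
    if isinstance(flags, int):
        # Advanced callers can pass any combination of flags directly.
        return flags
    if flags == "":
        return os.O_RDONLY
    result = 0
    for c in flags:
        if c == "r":
            result |= os.O_RDONLY
        elif c == "w":
            result |= os.O_WRONLY | os.O_CREAT | os.O_TRUNC
        elif c == "+":
            result |= os.O_RDWR
        else:
            raise ValueError("unsupported flag character: %r" % c)
    return result
```

A combination such as `os.O_WRONLY | os.O_APPEND`, which the string form in this sketch cannot express, passes through unchanged.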
Noah Watkins [Wed, 16 Mar 2016 21:12:05 +0000 (14:12 -0700)]
buffer: fix iterator_impl visibility through typedef
The following program doesn't compile because of symbol visibility issues.
While bufferlist::iterator is a class implementation with visibility specified,
it is unclear after google-fu how to do the same through typedef.
    #include <rados/buffer.h>

    int main()
    {
      ceph::bufferlist bl;
      ceph::bufferlist::const_iterator it = bl.begin();
      (void)it;
      return 0;
    }
[nwatkins@bender ~]$ g++ -Wall -std=c++11 -Iinstall/include -Linstall/lib -o test test.cc -lrados
/tmp/cciR9MUj.o: In function `main':
test.cc:(.text+0x43): undefined reference to `ceph::buffer::list::iterator_impl<true>::iterator_impl(ceph::buffer::list::iterator const&)'
/usr/bin/ld: test: hidden symbol `_ZN4ceph6buffer4list13iterator_implILb1EEC1ERKNS1_8iteratorE' isn't defined
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
RGWDataSyncShardCR will only allocate an error_repo if it's doing
incremental sync, so RGWDataSyncSingleEntryCR needs to guard against a
null error_repo.
Also, RGWDataSyncShardCR::stop_spawned_services() was dropping the last
reference to the error_repo before calling drain_all(), which meant that
RGWDataSyncSingleEntryCR could still be holding a pointer.
RGWDataSyncSingleEntryCR now uses a boost::intrusive_ptr to account for
its reference.
Loic Dachary [Thu, 26 May 2016 07:38:47 +0000 (09:38 +0200)]
ceph-disk: partprobe should block udev induced BLKRRPART
Wrap partprobe with flock to stop udev from issuing BLKRRPART because
this is racy and frequently fails with a message like:
Error: Error informing the kernel about modifications to partition
/dev/vdc1 -- Device or resource busy. This means Linux won't know about
any changes you made to /dev/vdc1 until you reboot -- so you shouldn't
mount it or use it in any way before rebooting.
Opening a device (/dev/vdc for instance) in write mode indirectly
triggers a BLKRRPART ioctl from udev (version 214 and later)
when the device is closed (see below for the udev release note).
However, if udev fails to acquire an exclusive lock (with
flock(fd, LOCK_EX|LOCK_NB); ) the BLKRRPART ioctl is not issued.
Acquiring an exclusive lock before running the process that opens the
device in write mode is therefore an effective way to control this
behavior.
git clone git://anonscm.debian.org/pkg-systemd/systemd.git
systemd/NEWS:
CHANGES WITH 214:
* As an experimental feature, udev now tries to lock the
disk device node (flock(LOCK_SH|LOCK_NB)) while it
executes events for the disk or any of its partitions.
Applications like partitioning programs can lock the
disk device node (flock(LOCK_EX)) and claim temporary
device ownership that way; udev will entirely skip all event
handling for this disk and its partitions. If the disk
was opened for writing, the close will trigger a partition
table rescan in udev's "watch" facility, and if needed
synthesize "change" events for the disk and all its partitions.
This is now unconditionally enabled, and if it turns out to
cause major problems, we might turn it on only for specific
devices, or might need to disable it entirely. Device Mapper
devices are excluded from this logic.
Fixes: http://tracker.ceph.com/issues/15176
Signed-off-by: Marius Vollmer <marius.vollmer@redhat com>
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 8519481b72365701d01ee58a0ef57ad1bea2c66c)
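The locking behavior this commit relies on can be demonstrated without a real block device. The sketch below uses a temporary file as a stand-in for the device node; the helper names `try_shared_lock` and `demo` are hypothetical, and the non-blocking shared lock follows the attempt described in the udev release note quoted above. It does not run partprobe or touch udev.

```python
import fcntl
import os
import tempfile


def try_shared_lock(fd):
    """Try a non-blocking shared lock; return True if it was acquired."""
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo():
    """Return (acquired while exclusive lock held, acquired after release)."""
    with tempfile.NamedTemporaryFile() as tmp:
        # Two separate open() calls give two independent lock owners.
        owner = os.open(tmp.name, os.O_RDWR)
        other = os.open(tmp.name, os.O_RDONLY)
        try:
            fcntl.flock(owner, fcntl.LOCK_EX)   # claim the "device"
            while_held = try_shared_lock(other)
            fcntl.flock(owner, fcntl.LOCK_UN)   # release the claim
            after_release = try_shared_lock(other)
            return while_held, after_release
        finally:
            os.close(owner)
            os.close(other)
```

While the exclusive lock is held, the second descriptor's attempt fails immediately instead of blocking, which is the condition under which udev is described as skipping its event handling.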
pybind/ceph_argparse: handle non ascii unicode args
We raise UnicodeDecodeError on seeing non-ASCII args if we fail to match
them with any command signature. Instead, we should use a unicode string
to represent the error in that case. Please note that the exception is
not printed at all in real-world use. =)
If a snaprealm has no child/parent snaprealm and the snaprealm inode
is not in the cache when the client reconnects, the snaprealm does not
get properly removed from MDCache::reconnected_snaprealms. This causes
an incorrect "unconnected snaprealm xxx" warning.
Yan, Zheng [Tue, 28 Jun 2016 12:39:08 +0000 (20:39 +0800)]
client: unify cap flush and snapcap flush
This patch includes the following changes:
- assign a flush tid to each snapcap flush
- remove the session's flushing_capsnaps list; add inodes with snapcap
  flushes to the session's flushing_caps list instead
- when reconnecting to the MDS, re-send an inode's snapcap flushes and
  cap flushes at the same time
Yan, Zheng [Wed, 29 Jun 2016 09:15:01 +0000 (17:15 +0800)]
mds: handle partly purged directory
For a snapshotted directory whose snaprealm parents are being opened,
the MDS does not know if the directory is purgeable, so it can't skip
committing dirfrags of the directory. But if the directory is purgeable,
some dirfrags could have already been deleted during MDS failover.
Committing them could return -ENOENT.