Yehuda Sadeh [Wed, 12 Mar 2014 01:19:44 +0000 (18:19 -0700)]
rgw: don't overwrite bucket entry data when syncing user stats
Fixes: #7687
When syncing user bucket stats we overwrote the entire entry with the
passed in entry. We should only look at the stats portion, and not
overwrite the rest (which contains bucket creation time).
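The intended behaviour can be sketched as follows. This is a minimal Python model, not the rgw code: BucketEntry, BucketStats and sync_stats are hypothetical names, and the fields are assumptions chosen only to show that the stats are replaced while the creation time survives.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class BucketStats:
    num_objects: int = 0
    size_bytes: int = 0


@dataclass(frozen=True)
class BucketEntry:
    name: str
    creation_time: float
    stats: BucketStats = field(default_factory=BucketStats)


def sync_stats(stored: BucketEntry, incoming: BucketEntry) -> BucketEntry:
    """Return the stored entry with only its stats taken from incoming."""
    # Everything except the stats (e.g. creation_time) comes from `stored`.
    return replace(stored, stats=incoming.stats)
```

Copying the whole incoming entry instead would reset creation_time to whatever the caller happened to pass in.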
Image names buffer is fixed at 1024. This turns out to be not enough:
there are at least two "rbd-fuse rbd_list: error %d Numerical result
out of range" reports on the ML. Fix it by calling rbd_list() twice to
first get the expected buffer size. Also, get rid of the memory leak
and tweak the error message while at it.
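The two-call pattern looks roughly like this. The sketch is Python with a stand-in for rbd_list(): fake_rbd_list, list_images and the image names are made up for illustration, and the stand-in only mimics the behaviour described above (a too-small buffer yields -ERANGE together with the required size).

```python
import errno

IMAGES = ["vm-disk-%04d" % i for i in range(200)]  # made-up image names


def fake_rbd_list(buf_size):
    """Stand-in for rbd_list(): returns (ret, size, names).

    When the buffer is too small, ret is -ERANGE and size is the number
    of bytes required; otherwise names holds NUL-terminated image names.
    """
    payload = b"".join(name.encode() + b"\0" for name in IMAGES)
    if buf_size < len(payload):
        return -errno.ERANGE, len(payload), b""
    return len(payload), len(payload), payload


def list_images():
    # First call with an empty buffer, only to learn the required size.
    ret, needed, _ = fake_rbd_list(0)
    if ret < 0 and ret != -errno.ERANGE:
        raise OSError(-ret, "rbd_list failed")
    # Second call with a buffer of exactly that size.
    ret, _, payload = fake_rbd_list(needed)
    if ret < 0:
        raise OSError(-ret, "rbd_list failed")
    return [n.decode() for n in payload.split(b"\0") if n]
```

With 200 names of 13 bytes each, the old fixed 1024-byte buffer would fail with -ERANGE, while the sized second call succeeds.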
Warren Usui [Fri, 21 Feb 2014 05:11:45 +0000 (21:11 -0800)]
Fix get_status() to find client.rados text inside of ps command results.
Added the port (a fixed value in teuthology for now) to the hostname.
Fixes: 7374 Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Signed-off-by: Warren Usui <warren.usui@inktank.com>
(cherry picked from commit 8200b8a02511e367370d33cb74c3d45ef85fca31)
Yan, Zheng [Sun, 9 Mar 2014 23:36:14 +0000 (07:36 +0800)]
mds: fix owner check of file lock
flock and POSIX locks do not use the process ID as the owner identifier.
The process ID of the lock holder is only used for F_GETLK fcntl(2).
In the Linux kernel, a file lock's owner identifier is the file pointer
through which the lock is requested.
The fix is to not take the 'pid_namespace' into consideration when
checking for conflicting locks. Also rename the 'pid' fields of struct
ceph_mds_request_args and struct ceph_filelock to 'owner', rename
'pid_namespace' fields to 'pid'.
The kclient counterpart of this patch modifies the flock code to
assign the file pointer to the 'owner' field of the lock message. It
also sets the most significant bit of the 'owner' field. We can use
that bit to distinguish between old and new clients.
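A simplified model of the owner check, in Python rather than the MDS C++ code. FileLock, same_owner, conflicts and from_new_client are hypothetical names; including the client id in the comparison is an assumption, and how locks from old clients are compared is not modelled here.

```python
from dataclasses import dataclass

NEW_CLIENT_BIT = 1 << 63  # most significant bit of a 64-bit 'owner' field


@dataclass(frozen=True)
class FileLock:
    client: int      # client session holding the lock (assumed field)
    owner: int       # opaque owner id (the file pointer on new clients)
    pid: int         # informational only, reported through F_GETLK
    start: int
    end: int         # inclusive
    exclusive: bool


def same_owner(a, b):
    # Ownership is decided by client and owner only; pid is ignored.
    return a.client == b.client and a.owner == b.owner


def conflicts(a, b):
    overlap = a.start <= b.end and b.start <= a.end
    return overlap and not same_owner(a, b) and (a.exclusive or b.exclusive)


def from_new_client(lock):
    # New clients set the top bit of 'owner'.
    return bool(lock.owner & NEW_CLIENT_BIT)
```

Two overlapping locks taken through the same file pointer by different processes (for example after fork) share an owner and therefore do not conflict in this model.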
Stephan Renatus [Mon, 10 Mar 2014 14:17:41 +0000 (15:17 +0100)]
rbdmap: bugfix upstart script
It seems like the upstart script is lagging a little behind [the initscript](https://github.com/ceph/ceph/blob/master/src/init-rbdmap#L44-L49); however, this bugfix makes it actually do what it should do.
Before, the bug made the job just ignore all parameters, with the following error in /var/log/upstart/rbdmap.log:
Samuel Just [Fri, 7 Mar 2014 23:54:23 +0000 (15:54 -0800)]
ReplicatedPG::finish_ctx: clear object_info if !obs.exists
Otherwise, we see a different object_info_t depending on whether the
transaction deleting the object clears before another op recreating it appears.
In particular, we use oi.version to set the prior_version on the log entries in
finish_ctx. If the oi is allowed to stick around, the recreation log event will
have a prior version of the deletion event when it should have a prior version
of eversion_t().
Fixes: #7655 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Babu Shanmugam [Sat, 8 Mar 2014 05:17:13 +0000 (05:17 +0000)]
Broke down sysinfo's format into a histogram with a value and count
so that we just see how many of each version/distro/kernel/os/arch/cpu/etc are running
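A minimal sketch of that value/count shape in Python. The histogram function and the report keys are assumptions for illustration, not the actual sysinfo format.

```python
from collections import Counter


def histogram(reports, key):
    """Collapse per-host reports into sorted value/count pairs for one key."""
    counts = Counter(r[key] for r in reports if key in r)
    return [{"value": v, "count": c} for v, c in sorted(counts.items())]
```

For example, three hosts reporting distro "ubuntu", "centos" and "ubuntu" collapse to two entries, with a count of 2 for "ubuntu" and 1 for "centos".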
Looking for an entry in olog which matches one of ours might add
extra divergent entries. Instead, do what merge_log does and
walk back through the auth log looking for an entry in olog.
Fixes: 7657 Signed-off-by: Samuel Just <sam.just@inktank.com>
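The walk-back can be pictured as below. This is a toy Python model, not the PGLog code: logs are plain lists of (epoch, version) tuples ordered oldest to newest, and last_shared_version is a hypothetical name.

```python
def last_shared_version(auth_log, olog):
    """Walk auth_log from newest to oldest and return the first version
    that also appears in olog, or None if the logs share nothing."""
    other = set(olog)
    for version in reversed(auth_log):
        if version in other:
            return version
    return None
```

In this model, entries in olog newer than the returned version are the divergent ones; starting the search from the authoritative log avoids treating additional olog entries as divergent.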
Yan, Zheng [Thu, 6 Mar 2014 23:12:39 +0000 (07:12 +0800)]
client: fix Client::getcwd()
A recent commit made the MDS not include the dentry trace in the
LOOKUPPARENT reply, which broke Client::getcwd(). The fix is to change
getcwd() to use the LOOKUPNAME MDS request.
Yan, Zheng [Thu, 6 Mar 2014 07:24:02 +0000 (15:24 +0800)]
mds: introduce LOOKUPNAME MDS request
The new MDS request is used for connecting a given inode to its
parent inode. It allows the client to implement the get_name() NFS
export callback efficiently.
Sage Weil [Fri, 7 Mar 2014 22:02:26 +0000 (14:02 -0800)]
mon/PGMap: send pg create messages to primary, not acting[0]
For erasure pools, these may not match.
In the case of #7652, this caused pg_create messages to be sent
indefinitely. register_pg() added it to the list for acting_primary, and
when we got the (non-creating) pg stat update we removed it from the list
for acting[0].
Fixes: #7652 Signed-off-by: Sage Weil <sage@inktank.com>
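The mismatch can be reproduced with a toy model in Python. CreatingPGs and its methods are hypothetical names; the real bookkeeping lives in the monitor's PGMap code.

```python
from collections import defaultdict


class CreatingPGs:
    """Tracks which OSD should be sent pg_create for each creating PG."""

    def __init__(self):
        self.by_osd = defaultdict(set)

    def register(self, pgid, osd):
        self.by_osd[osd].add(pgid)

    def created(self, pgid, osd):
        # Only removes the PG if `osd` is the key it was registered under.
        self.by_osd[osd].discard(pgid)

    def pending(self):
        return {osd: pgs for osd, pgs in self.by_osd.items() if pgs}
```

If a PG is registered under its acting primary (say osd 3) but removed under acting[0] (say osd 5), it stays pending forever; using the primary for both calls clears it.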
Sage Weil [Fri, 7 Mar 2014 21:29:03 +0000 (13:29 -0800)]
mon/OSDMonitor: make osdmap feature checks non-racy
The check for OSD features may race with the boot of an OSD that does not
have the necessary features. Check the pending info too, and if there is
a missing feature, return -EAGAIN. In the callers, wait on -EAGAIN.
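A rough Python sketch of the check-then-retry shape. check_osd_features and run_when_stable are hypothetical names, features are modelled as sets of strings, and the -EPERM returned for a committed mismatch is an assumption; only the -EAGAIN for the pending case comes from the description above.

```python
import errno


def check_osd_features(committed, pending, required):
    """committed and pending map an OSD id to its set of feature names."""
    if any(not required <= feats for feats in committed.values()):
        return -errno.EPERM   # assumed error code for a committed mismatch
    if any(not required <= feats for feats in pending.values()):
        return -errno.EAGAIN  # a booting OSD lacks the feature: retry later
    return 0


def run_when_stable(check, wait, max_tries=5):
    """Caller side: wait and re-check for as long as check returns -EAGAIN."""
    for _ in range(max_tries):
        ret = check()
        if ret != -errno.EAGAIN:
            return ret
        wait()  # e.g. wait for the pending state to be committed
    return -errno.EAGAIN
```

The retry bound is only there to keep the sketch finite; the source does not say how the callers wait.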
qa: workunits/mon/rbd_snaps_ops.sh: ENOTSUP on snap rm from copied pool
'rados cppool' copies the contents but that doesn't make the destination
pool an unmanaged snaps pool. Therefore, we must get an ENOTSUP when
we try to remove an unmanaged snap from a not-unmanaged pool.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
mon: OSDMonitor: don't remove unmanaged snaps from not-unmanaged pools
Although we should allow creating unmanaged snaps on not-unmanaged pools,
as long as those pools don't have any managed snapshots in them, we cannot
allow removal -- because the pool will not have any unmanaged snapshots.
Fixes: 7210 Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Sage Weil [Fri, 7 Mar 2014 00:12:30 +0000 (16:12 -0800)]
osd: fix agent thread shutdown
We had an old invariant that agent_queue would have at least 1 entry in
it to simplify some other code paths, but it turns out that it is simpler
not to do that.
In particular, this was triggering a failed assertion on shutdown when we
assert that the queue is empty.
Dump offending items on shutdown if they are there, though, to catch any
future bugs.
Fixes: #7637 Signed-off-by: Sage Weil <sage@inktank.com>
Loic Dachary [Thu, 6 Mar 2014 23:07:26 +0000 (00:07 +0100)]
logrotate: copy/paste daemon list from *-all-starter.conf
Each upstart/*-all-starter.conf uses the same script to find the list of
daemons and their ids. Copy it over to the corresponding logrotate.conf
script instead of using a less reliable script based on initctl list
output.
If logrotate fails to run initctl reload on a daemon, it will keep
writing to the rotated log file, even after it is deleted and until it
fills the disk. By using the exact same shell snippet as the upstart
scripts used to start the daemon, all of them will be sent the HUP
signal and reopen the log file that was just rotated.
Samuel Just [Thu, 6 Mar 2014 20:05:07 +0000 (12:05 -0800)]
ReplicatedPG: clean up num_dirty adjustments
Previously, a _delete_head() followed by a recreation on an object in
the same transaction would result in num_dirty being decremented in
_delete_head() without the flag being cleared. make_writeable() would
then see exists and was_dirty and therefore not increment num_dirty,
resulting in a mismatch. Rather than trying to maintain the num_dirty
number in _delete_head(), rollback_to(), and make_writeable(), it seems
simpler to do the adjustment once in make_writeable based on undirty,
ctx->obc->obs.oi, and ctx->new_obs->oi.
Fixes: 7393 Signed-off-by: Samuel Just <sam.just@inktank.com>
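The single-adjustment idea can be sketched in Python. num_dirty_delta is a hypothetical helper and the four booleans are a simplification of the object state before and after the transaction; the real code works on object_info_t flags.

```python
def num_dirty_delta(old_exists, old_dirty, new_exists, new_dirty):
    """Change to apply to num_dirty, computed once from before/after state."""
    before = 1 if (old_exists and old_dirty) else 0
    after = 1 if (new_exists and new_dirty) else 0
    return after - before
```

A dirty object that is deleted and recreated dirty within one transaction yields a delta of 0, so the counter cannot drift the way it could when several code paths each adjusted it.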
Sage Weil [Wed, 5 Mar 2014 23:58:52 +0000 (15:58 -0800)]
mon/OSDMonitor: fix pool deletion checks, races
Unify the pool deletion safety checks into a single set of functions.
Make sure we check the committed state and error out if there is a problem.
Also check the pending state, if any, and delay+retry if there is a
problem there.
This ensures that we correctly verify that a pool is not in use when it
is deleted (by another tier or by cephfs). These checks are also now
applied to librados calls.
Fixes: #7590 Signed-off-by: Sage Weil <sage@inktank.com>