The calls to journaler->is_readable() and journaler->get_error()
in MDLog::_replay_thread() will drop Journaler::lock between
invocations, so, theoretically, it's possible that in the initial check:

    // loop
    int r = 0;
    while (1) {
      // wait for read?
      while (!journaler->is_readable() &&
             journaler->get_read_pos() < journaler->get_write_pos() &&
             !journaler->get_error()) {
        C_SaferCond readable_waiter;
        journaler->wait_for_readable(&readable_waiter);
        r = readable_waiter.wait();
      }
      if (journaler->get_error()) {
        r = journaler->get_error();
        dout(0) << "_replay journaler got error " << r << ", aborting" << dendl;

journaler->is_readable() returned true, thereby breaking out of
the (inner) while loop and bypassing the journaler->get_error()
check, and by the time this hits the next set of checks:

      if (!journaler->is_readable() &&
          journaler->get_read_pos() == journaler->get_write_pos())
        break;
      ceph_assert(journaler->is_readable() || mds->is_daemon_stopping());

the journal may have become unreadable due to some error that
happened during prefetch. In short, these checks are racy.
So, remove this racy assert along with the journaler->is_readable()
check when validating the journal end, and rely on the next iteration
of the journal read loop for error handling.
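
For illustration only, the interleaving described above can be reproduced
with a small stand-alone sketch. FakeJournaler, fail_prefetch(),
set_readable() and old_check_would_assert() are hypothetical names made up
for this example and are not part of Ceph; the only property borrowed from
the description is that each accessor takes and releases the lock on its
own, so state can change between two calls.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for Journaler: every accessor locks independently,
// so two consecutive calls are not atomic with respect to each other.
class FakeJournaler {
public:
  bool is_readable() {
    std::lock_guard<std::mutex> l(lock);
    return readable && error == 0;
  }
  int get_error() {
    std::lock_guard<std::mutex> l(lock);
    return error;
  }
  void set_readable() {
    std::lock_guard<std::mutex> l(lock);
    readable = true;
  }
  // Simulates a prefetch failure reported by another thread.
  void fail_prefetch(int err) {
    std::lock_guard<std::mutex> l(lock);
    error = err;
    readable = false;
  }

private:
  std::mutex lock;
  bool readable = false;
  int error = 0;
};

// Replays the problematic ordering on a single thread: the first check
// passes, the error lands while the lock is not held, and the second
// check (the one the old assert relied on) then fails.
bool old_check_would_assert(FakeJournaler& j) {
  bool first = j.is_readable();   // wait loop exits, error check is skipped
  j.fail_prefetch(-5);            // -5 is an arbitrary stand-in error code
  bool second = j.is_readable();  // what the removed assert would observe
  return first && !second;
}

void demo() {
  FakeJournaler j;
  j.set_readable();
  assert(old_check_would_assert(j));
}
```

The sketch forces the interleaving on one thread so that it is
deterministic; in the real replay thread the same ordering would depend on
timing between the replay thread and prefetch completion.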
Fixes: http://tracker.ceph.com/issues/57048
Signed-off-by: Venky Shankar <vshankar@redhat.com>
break;
}
- if (!journaler->is_readable() &&
- journaler->get_read_pos() == journaler->get_write_pos())
+ if (journaler->get_read_pos() == journaler->get_write_pos()) {
+ dout(10) << "_replay: read_pos == write_pos" << dendl;
break;
-
- ceph_assert(journaler->is_readable() || mds->is_daemon_stopping());
+ }
// read it
uint64_t pos = journaler->get_read_pos();