qa: fix misleading "in cluster log" failures during cluster log scan
Summary:
Fix misleading failure reasons reported as `"… in cluster log"` when
no such log entry actually exists.
The cephadm task currently treats `grep` errors from the cluster log
scan as if they were actual log matches. This can produce bogus
failure summaries when `ceph.log` is missing, especially after early
failures such as image pull or bootstrap problems.
Problem:
first_in_ceph_log() currently:
- returns stdout if a match is found
- otherwise returns stderr
The caller then treats any non-None value as a real cluster log hit and formats it as:
    "<value>" in cluster log
That means an error like:
    grep: /var/log/ceph/<fsid>/ceph.log: No such file or directory
can be misreported as if it came from the cluster log.
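
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the actual teuthology code: the helper names `first_in_ceph_log_old` and `summarize_old` are hypothetical, and grep's stdout and stderr are passed in as plain strings instead of being read from a remote command.

```python
def first_in_ceph_log_old(stdout, stderr):
    # Hypothetical simplification of the old behaviour:
    # return stdout when grep matched, otherwise fall back to stderr.
    if stdout:
        return stdout
    return stderr or None


def summarize_old(value):
    # The caller cannot tell a log line from a grep error message,
    # so any non-None value is formatted as a cluster log hit.
    if value is not None:
        return '"{}" in cluster log'.format(value)
    return None


grep_error = "grep: /var/log/ceph/<fsid>/ceph.log: No such file or directory"
print(summarize_old(first_in_ceph_log_old("", grep_error)))
```

With an empty stdout and the grep error on stderr, the summary wraps the grep error itself in the "in cluster log" wording, which is the misleading report this change addresses.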
Fix:
This change makes cluster log scanning robust and accurate by:
- checking whether /var/log/ceph/<fsid>/ceph.log exists before scanning
- using check_status=False for the grep pipeline
- treating only stdout as a real log match
- treating stderr as a scan error instead of log content
- avoiding overwrite of a more accurate pre-existing failure_reason
- reporting scan failures separately as `"cluster log scan failed"`
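
The decision logic in the list above could look roughly like the following sketch. It is illustrative only and makes several assumptions: the function names `scan_result` and `update_failure_reason` are hypothetical, a missing log file is assumed to be skipped silently, any pre-existing `failure_reason` is assumed to take precedence, and grep's output is passed in as strings rather than obtained from a command run with `check_status=False`.

```python
def scan_result(log_exists, stdout, stderr):
    """Return (match, error); at most one of the two is non-None."""
    if not log_exists:
        # Assumption: a missing ceph.log means there is nothing to scan.
        return None, None
    match = stdout.strip()
    if match:
        # Only stdout counts as real cluster log content.
        return match.splitlines()[0], None
    err = stderr.strip()
    if err:
        # stderr is a scan error, never log content.
        return None, err
    return None, None


def update_failure_reason(failure_reason, match, error):
    if failure_reason is not None:
        # Assumption: keep the earlier, more accurate reason untouched.
        return failure_reason
    if match is not None:
        return '"{}" in cluster log'.format(match)
    if error is not None:
        # Scan failures are reported separately from log matches.
        return "cluster log scan failed"
    return None
```

For example, `update_failure_reason("image pull failed", *scan_result(True, "", "grep: permission denied"))` keeps `"image pull failed"`, while the same scan result with no prior reason yields `"cluster log scan failed"`.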