Discussion:
How to employ a mirror in this situation?
Harry Putnam
2014-04-05 03:27:04 UTC
Running oi 151_a9 as vbox vm guest on win7 host.

I'm running a set of paired discs as mirrored pools. This is a
test/learning rig so nothing here is truly desperate.

I'm getting dire warnings from zpool status -v that I'm not
experienced enough to know what to do with. And even though the
warnings seem pretty serious, I can't help but think it's something less
serious, some bungling of mine somewhere that is giving a false alarm
or something of that nature.

I'll post the `zpool status' in a moment, but first let me say that a
random sample of the files under the directories mentioned in the
output seems to indicate that the files at least work. Some are music
files and a lot more are photos.

I have viewed a small random sample of the photos and ditto for
playing the music. A few tested text files also seem to be readable
etc.

I'm also puzzled by the fact that the error warnings do not specify
files but instead only directories. I haven't seen this kind of thing
before, so I'm not sure what counts as normal.

I went to the site listed in the output as the place to go for info:
http://illumos.org/msg/ZFS-8000-8A

One thing I noticed there was that the examples shown actually
referred to individual files... not directories.

Oh, and assuming the worst, that I do have to replace all the files
from backups: can that be done from the companion mirroring disc?

I do have the same files available elsewhere, but would like to find
out what it really means to be mirroring the pools.

As I mentioned these are all dispensable files, but I'd like to carry
thru with whatever fixing is required and hopefully learn a bit about
how things are supposed to be done.
------- --------- ---=--- --------- --------
zfs list -r p4
NAME             USED  AVAIL  REFER  MOUNTPOINT
p4               258G   352G    31K  /p/p4
p4/eBk           258G   352G  30.1M  /eBk
p4/eBk/eImgMus   258G   352G   257G  /eBk/eImgMus

------- --------- ---=--- --------- --------
# zpool status -v p4

  pool: p4
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        p4          ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c4t8d0  ONLINE       0     0     0
            c4t9d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /p/p4/
        /eBk/eImgMus/
Marion Hakanson
2014-04-05 20:44:59 UTC
Hi Harry,

The "zpool status" shows your pool is mirrored, so you are already making
use of both drives in the mirror to correct errors as much as possible.
The fact that the errors were described as "permanent" means that for some
reason, ZFS was not able to repair them using the redundancy within the pool
(see below).

A directory is in reality just a special type of file, so it can experience
corruption or read/write errors just like any other file type. An error in
the contents of a directory doesn't necessarily indicate an error in the
contents of any of the files or subdirectories the directory points to;
however, it may prevent you from getting to files (e.g. if a pointer to
such a file is corrupted or lost).

What you did (mentioned in a follow-on email in your openindiana-discuss
thread about this issue) was appropriate: Clear the error, initiate a
scrub, and see if the error returns. If not, you may be lucky and have
experienced a random, transient, unexplained glitch.
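
The clear, scrub, and re-check sequence described above might look
roughly like the sketch below. The pool name p4 is taken from the
original post. The helper name pool_has_errors is hypothetical, and the
check assumes that a healthy pool reports the line "errors: No known
data errors" in its zpool status output.

```shell
#!/bin/sh
# Sketch only: pool_has_errors is a hypothetical helper, not part of ZFS.
# It reads `zpool status` text on stdin and succeeds (exit 0) when the
# "errors:" line reports anything other than "No known data errors".
pool_has_errors() {
    awk '
        /^errors:/ {
            seen = 1
            if ($0 !~ /No known data errors/) bad = 1
        }
        END { exit (seen && bad) ? 0 : 1 }
    '
}

# Typical sequence, run as root (commented out so the sketch is inert):
#   zpool clear p4     # reset the pool's error counters
#   zpool scrub p4     # re-read and verify the data in the pool
#   zpool status -v p4 | pool_has_errors && echo "errors are still reported"
```

The scrub runs in the background, so the final status check is only
meaningful once the scan line shows the scrub has completed.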

But the fact that you saw errors in two pools at around the same time
could indicate some flakiness in your hardware. It's not uncommon for
a weak power-supply, bad/loose/marginal cable, crummy disk controller,
marginal RAM without ECC, etc. to result in recurring corruption being
detected (and hopefully repaired) by ZFS. In this case, ZFS was unable
to repair the corruption, which probably indicates that the redundant
block(s) also showed errors. So maybe some single component like a
disk controller or RAM introduced the error onto both sides of the mirror.

So, if errors return, you've got marginal/bad hardware somewhere in your
system, and need to fix or replace things until the errors stay away.
In the case of your virtual system, maybe there's a glitch in one of
the virtualized (software emulated) parts of the puzzle. Have you got
VirtualBox configured to honor (not ignore) disk cache flushes, for example?
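
For reference, the VirtualBox manual describes an IgnoreFlush extradata
key that controls whether guest flush requests are passed on to the
host. A hedged sketch of setting it follows. The VM name "oi151" and the
LUN number 0 are placeholders (assumptions, not details from this
thread), and the device part of the key depends on whether the virtual
disk is attached to the SATA (ahci) or IDE (piix3ide) controller.

```shell
#!/bin/sh
# Sketch only: flush_key is a hypothetical helper that builds the
# extradata key described in the VirtualBox manual.
#   $1 = controller device: "ahci" (SATA) or "piix3ide" (IDE)
#   $2 = LUN number of the virtual disk on that controller
flush_key() {
    printf 'VBoxInternal/Devices/%s/0/LUN#%s/Config/IgnoreFlush\n' "$1" "$2"
}

# Run on the host ("oi151" is a placeholder VM name). A value of 0 asks
# VirtualBox to honor flush requests rather than ignore them.
#   VBoxManage setextradata "oi151" "$(flush_key ahci 0)" 0
#   VBoxManage getextradata "oi151" "$(flush_key ahci 0)"
```

Each virtual disk has its own LUN, so both halves of a mirror would need
the setting applied separately.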

If I were you, I'd consider myself fortunate to have avoided silent,
unnoticed corruption of the data. Personally, if I have any choice,
I will not trust my data to any filesystem without some form of block-level
checksum or other error-detection capabilities.

Regards,

Marion


============================================================
Subject: [zfs] How to employ a mirror in this situation?
From: Harry Putnam <***@newsguy.com>
Date: Fri, 4 Apr 2014 23:27:04 -0400 (20:27 PDT)
To: <***@lists.illumos.org>

[...]
Richard Elling
2014-04-05 22:30:06 UTC
Post by Marion Hakanson
> Hi Harry,
>
> The "zpool status" shows your pool is mirrored, so you are already making
> use of both drives in the mirror to correct errors as much as possible.
> The fact that the errors were described as "permanent" means that for some
> reason, ZFS was not able to repair them using the redundancy within the pool
> (see below).
>
> A directory is in reality just a special type of file, so it can experience
> corruption or read/write errors just like any other file type. An error in
> the contents of a directory doesn't necessarily indicate an error in the
> contents of any of the files or subdirectories the directory points to;
> however, it may prevent you from getting to files (e.g. if a pointer to
> such a file is corrupted or lost).
>
> What you did (mentioned in a follow-on email in your openindiana-discuss
> thread about this issue) was appropriate: Clear the error, initiate a
> scrub, and see if the error returns. If not, you may be lucky and have
> experienced a random, transient, unexplained glitch.
Scrub twice. The errors shown are for the current and previous periods.
-- richard
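
A rough sketch of scrubbing twice in a row follows. It assumes that
zpool status prints "scrub in progress" on its scan line while a scrub
is running. The helper name scrub_twice and the 60-second polling
default are hypothetical choices, not anything taken from this thread.

```shell
#!/bin/sh
# Sketch only: scrub_twice is a hypothetical helper.
#   $1 = pool name, $2 = seconds between status polls (default 60)
scrub_twice() {
    pool=$1
    interval=${2:-60}
    for pass in 1 2; do
        zpool scrub "$pool" || return 1
        # Poll until the status output no longer reports a running scrub.
        while zpool status "$pool" | grep -q 'scrub in progress'; do
            sleep "$interval"
        done
        echo "scrub pass $pass finished"
    done
    # Show the error list as it stands after the second pass.
    zpool status -v "$pool"
}

# Example (as root):  scrub_twice p4
```

Waiting for the first scrub to finish before starting the second avoids
asking for a scrub while one is already running on the pool.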

--

***@RichardElling.com
+1-760-896-4422

-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
Harry Putnam
2014-04-06 16:47:14 UTC
Marion Hakanson <***@ohsu.edu> writes:

[...]

Thanks for useful input.
Post by Marion Hakanson
> A directory is in reality just a special type of file, so it can experience
> corruption or read/write errors just like any other file type. An error in
> the contents of a directory doesn't necessarily indicate an error in the
> contents of any of the files or subdirectories the directory points to;
> however, it may prevent you from getting to files (e.g. if a pointer to
> such a file is corrupted or lost).
Well, yes, I know directories are no more than tricky kinds of files
but still, I've looked at several sources of Docu about zpool status
output regarding corruption etc, and have yet to see examples that
show a directory by itself in the final lines of output. All I've
seen, which admittedly is precious few, referred to actual common
files.

I asked on both oi and illumos lists if it was at all peculiar to see
only directories in the final lines of output... but so far no one has
addressed that point specifically.

It is true that the Oracle documentation at:
http://docs.oracle.com/cd/E19253-01/819-5461/gbbwl/index.html

mentions files AND directories.

In your experience, have you seen zpool status -v output that showed
just directories and not any files in those final lines?
Post by Marion Hakanson
> What you did (mentioned in a follow-on email in your openindiana-discuss
> thread about this issue) was appropriate: Clear the error, initiate a
> scrub, and see if the error returns. If not, you may be lucky and have
> experienced a random, transient, unexplained glitch.
I didn't even clear... just ran scrub... and all talk of errors
disappeared.
Post by Marion Hakanson
> But the fact that you saw errors in two pools at around the same time
> could indicate some flakiness in your hardware. It's not uncommon for
> a weak power-supply, bad/loose/marginal cable, crummy disk controller,
> marginal RAM without ECC, etc. to result in recurring corruption being
> detected (and hopefully repaired) by ZFS. In this case, ZFS was unable
> to repair the corruption, which probably indicates that the redundant
> block(s) also showed errors. So maybe some single component like a
> disk controller or RAM introduced the error onto both sides of the mirror.
I won't bore you to tears here about what I may have done with vbox. I
did go into it at some length in the other thread on oi group if you are
interested.

Looking back, I now think the problem was produced by my meddling with
the discs thru vbox. So I may have somehow fooled zfs into seeing a
problem.

[...]

Thanks again for the excellent input.
Richard Elling
2014-04-05 22:27:55 UTC
Post by Harry Putnam
> Running oi 151_a9 as vbox vm guest on win7 host.
I don't run virtualbox, but I do recall gnashing of teeth over the default
choices for disk emulation. See the topic and this thread
https://forums.virtualbox.org/viewtopic.php?f=2&t=20275

-- richard


Harry Putnam
2014-04-06 17:07:48 UTC
Post by Richard Elling
> Post by Harry Putnam
>> Running oi 151_a9 as vbox vm guest on win7 host.
> I don't run virtualbox, but I do recall gnashing of teeth over the default
> choices for disk emulation. See the topic and this thread
> https://forums.virtualbox.org/viewtopic.php?f=2&t=20275
I'd be curious to know if anything has changed on that score... those
threads are a full 5yrs old. There may have been something newer in
there somewhere but it appeared the bulk of it occurred in Jan 2009,
making it 5.25 years old.
