First order of business: expand from 9x2TB to 9x4TB drives. Following refs such as:
http://www.itsacon.net/computers/unix/growing-a-zfs-pool/
I first check that all my drives are currently alive and the pool is fully resilvered, and then scrub to make doubly sure there are no errors.
My pool is raidz2, so in theory I could replace two disks at once, but obviously that leaves me vulnerable. Having dual parity lets me swap one disk at a time and still be protected against a disk failure mid-upgrade!
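The intended loop, roughly, is sketched below (the replace/status steps mirror what the docs describe for a same-slot swap; autoexpand is a pool property I believe this ZFS version has, but I haven't confirmed it on EON yet):

zpool scrub mediapool              (confirm zero errors before pulling anything)
  ...shut down, swap one 2TB drive for a 4TB drive in the same slot, boot...
zpool replace mediapool c3t4d0     (resilver onto the new drive in the same slot)
zpool status mediapool             (wait for the resilver to finish before touching the next disk)
  ...repeat for each of the nine drives, then...
zpool set autoexpand=on mediapool  (or 'zpool online -e mediapool <device>' per disk, so the pool grows into the new space)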
Start with "zpool status", which shows mediapool totally healthy.
"zpool scrub mediapool"
Says it'll take 50 hours to complete. Well, we wouldn't have gone out of our way to use ZFS unless we were paranoid about our data, so we'll suck it up and wait it out.
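(While it grinds away, progress can be watched rather than guessed at; "zpool status mediapool" includes a line showing the scrub's percent complete and estimated time to go.)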
Okay, scrub completed successfully. Shut down, remove one of the 2TB drives and replace it with a 4TB drive. Verify at boot that the BIOS sees the 4TB drive.
media:1:~#zpool status
  pool: mediapool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME                    STATE     READ WRITE CKSUM
        mediapool               DEGRADED     0     0     0
          raidz2-0              DEGRADED     0     0     0
            c0t0d0              ONLINE       0     0     0
            c0t1d0              ONLINE       0     0     0
            c2t2d0              ONLINE       0     0     0
            c2t3d0              ONLINE       0     0     0
            c3t0d0              ONLINE       0     0     0
            c3t1d0              ONLINE       0     0     0
            c3t2d0              ONLINE       0     0     0
            c3t3d0              ONLINE       0     0     0
            390014080793031528  UNAVAIL      0     0     0  was /dev/dsk/c3t4d0s0

errors: No known data errors
#zpool replace mediapool c3t4d0s0
cannot open '/dev/dsk/c3t4d0s0': I/O error
cfgadm -s "select=type(disk)"
Ap_Id                          Type         Receptacle   Occupant     Condition
sata0/0::dsk/c3t0d0            disk         connected    configured   ok
sata0/1::dsk/c3t1d0            disk         connected    configured   ok
sata0/2::dsk/c3t2d0            disk         connected    configured   ok
sata0/3::dsk/c3t3d0            disk         connected    configured   ok
sata0/4::dsk/c3t4d0            disk         connected    configured   ok
sata1/0::dsk/c0t0d0            disk         connected    configured   ok
sata1/1::dsk/c0t1d0            disk         connected    configured   ok
sata2/2::dsk/c2t2d0            disk         connected    configured   ok
sata2/3::dsk/c2t3d0            disk         connected    configured   ok
So the system sees c3t4d0; most likely the s0 slice just doesn't exist on the new disk. I can either replace using p0 to use the entire disk, or run format to lay down a single slice and use s0. I don't recall why I used s0 before... probably so that I could later upgrade some of the drives to larger ones with a matching s0, then use s1 for the extra space. That should be possible even with p0. Need to check my records.
Wait, that's not right. zpool replace should take a raw disk, format it, and resilver (according to http://docs.oracle.com/cd/E19082-01/817-2271/gbcet/index.html)
#zpool replace mediapool c3t4d0
cannot replace c3t4d0 with c3t4d0: device is too small
#format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t0d0
/pci@0,0/pci8086,3a46@1c,3/pci1458,b000@0/disk@0,0
1. c0t1d0
/pci@0,0/pci8086,3a46@1c,3/pci1458,b000@0/disk@1,0
2. c1d0
/pci@0,0/pci8086,244e@1e/pci-ide@0/ide@0/cmdk@0,0
3. c2t2d0
/pci@0,0/pci8086,244e@1e/pci1095,7124@1/disk@2,0
4. c2t3d0
/pci@0,0/pci8086,244e@1e/pci1095,7124@1/disk@3,0
5. c3t0d0
/pci@0,0/pci1458,b005@1f,2/disk@0,0
6. c3t1d0
/pci@0,0/pci1458,b005@1f,2/disk@1,0
7. c3t2d0
/pci@0,0/pci1458,b005@1f,2/disk@2,0
8. c3t3d0
/pci@0,0/pci1458,b005@1f,2/disk@3,0
9. c3t4d0
/pci@0,0/pci1458,b005@1f,2/disk@4,0
There's the problem: my drive, which shows up as 4000GB to the BIOS, shows up as only 1.64 TB within EON Solaris. Why? Poking around the net, it should be supported.
It is an odd point that the perceived size is exactly 2 TiB less than the actual size (~3.64 TiB), which smells like a 32-bit sector-count limit somewhere in the stack rather than an OS word-size issue; I'm running 64-bit Solaris. The BIOS reports the drive as 4TB (need to double-check this is still the case, as I tried a few different slots trying to get it to work).
Answer: although I was not able to find documentation of this, early versions of EON did not have support for 3TB+ drives completely in place. I need to update to a recent version.
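The arithmetic behind that 2 TiB hunch (just a sanity check on my part, not something I've confirmed against EON's internals):

2^32 sectors x 512 bytes/sector = 2,199,023,255,552 bytes = 2 TiB exactly
a "4 TB" drive = 4 x 10^12 bytes ≈ 3.64 TiB
3.64 TiB - 2 TiB ≈ 1.64 TiB, which is just what format reports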
After updating to EON version 1.0b,
media:1:~#format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t0d0
/pci@0,0/pci8086,3a46@1c,3/pci1458,b000@0/disk@0,0
The drive's correct size is seen.
Check pool status
media:2:~#zpool status
  pool: mediapool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
  scan: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        mediapool                 DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            14963257236115845187  UNAVAIL      0     0     0  was /dev/dsk/c0t0d0s0
            c0t1d0                ONLINE       0     0     0
            c4t2d0                ONLINE       0     0     0
            c4t3d0                ONLINE       0     0     0
            c5t0d0                ONLINE       0     0     0
            c5t1d0                ONLINE       0     0     0
            c5t2d0                ONLINE       0     0     0
            c5t3d0                ONLINE       0     0     0
            c5t4d0                ONLINE       0     0     0
and we're ready to replace
media:4:~#zpool replace mediapool c0t0d0
cannot replace c0t0d0 with c0t0d0: devices have different sector alignment
Crap.
According to this thread, it is completely impossible to replace a 512-byte-sector drive with a 4K-sector drive. Oddly, it seems that ZFS will actually tolerate some mixing of alignments (and sector sizes?) at pool-creation time, or when adding drives to a pool, but not when it comes time to replace a drive. The *only* option appears to be to back all the data up and create a fresh pool. A very unappealing prospect. It also means I will need 7x2TB drives (nine drives of raidz2 leaves seven drives' worth of data to hold).
Although the above thread seems authoritative, need to sleep on it and understand the situation a little better before taking action.
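One sanity check worth doing first (a sketch; I believe zdb will dump the pool's vdev configuration, including its ashift, but I haven't verified the exact output format on EON):

zdb -C mediapool | grep ashift     (ashift=9 means 512-byte alignment, ashift=12 means 4K; a pool built on the old drives should presumably show 9)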
I have enough 2TB drives to make a complete backup, but some of them are in use in other ways. A few may already hold backups of an older state of the EON pool; those I can overwrite with the current pool contents. Beyond that I have some matched master/backup pairs of data which are queued to enter the pool.
Also worth noting: at the time of writing, the cheapest 4TB drives ($210) cost the same per terabyte as 2TB drives ($105), but the 4TB drives could be kept as spares for the new pool afterward. So any new drives purchased should be 4TB.
...
After backing all the media up to UFS-formatted drives, I created a new pool from the nine 4TB drives and copied all the files back. I failed to upgrade the pool in-place, but I did end up with all my data safe and sound on the new pool.
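For the record, the shape of that migration was roughly the following (a sketch, not a transcript: the pool device names are taken from the listing above, the UFS backup disk c6t0d0 is a stand-in, and the exact copy commands varied):

zpool export mediapool                 (retire the old 2TB pool once the backups were verified)
  ...swap in the nine 4TB drives...
zpool create mediapool raidz2 c0t0d0 c0t1d0 c4t2d0 c4t3d0 c5t0d0 c5t1d0 c5t2d0 c5t3d0 c5t4d0
                                       (whole disks, so the new pool picks up 4K alignment if the drives report it)
mount -F ufs /dev/dsk/c6t0d0s0 /mnt    (one UFS backup drive at a time)
cp -rp /mnt/* /mediapool/              (repeat the mount/copy for each backup drive)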