INTRODUCING ZFS - Alyseo

ZFS: the filesystem for storage!

The features that make ZFS unique are its unified volume and filesystem management, end-to-end data integrity with protection against silent data corruption (bit rot, phantom writes, DMA parity errors, driver bugs), enormous scalability and a copy-on-write transactional model.

ZFS uses a 128-bit addressing scheme, and a single storage pool can hold up to 275 billion TB of data. In practice, its capacity limits are far beyond anything that can be deployed today.

Features

Software RAIDs

ZFS offers software RAID through its RAID-Z and mirror topologies. Software RAID is more cost effective because it eliminates the need for proprietary hardware RAID controllers.
When data is updated in a RAID stripe, the parity needs to be updated as well.
Because there is no way to update two or more disks atomically, a power outage in the middle of an update can leave the stripe corrupted. This is known as the RAID-5 write hole. ZFS is immune to it: thanks to copy-on-write, RAID-Z never overwrites a live stripe in place, so data and parity are always written together to new blocks.
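
For illustration, a minimal sketch of creating a double-parity RAID-Z pool; the pool name tank and the disk names are placeholders:

    # Create a pool using RAID-Z2 (double parity) across four disks
    zpool create tank raidz2 sda sdb sdc sdd
    # Verify the layout and health of the pool
    zpool status tank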

Scalability

As a 128-bit filesystem, ZFS has no practical limit on space expansion. It uses a pooled storage approach, so the pool size can be grown transparently at any time, without downtime or interruption.
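
For example, an existing pool can be grown online by adding another vdev; the extra capacity is available immediately (pool and device names below are placeholders):

    # Add a second RAID-Z2 vdev to the pool; no downtime is required
    zpool add tank raidz2 sde sdf sdg sdh
    # Confirm the new capacity
    zpool list tank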

End to end Data Integrity

With ZFS, each block is checksummed and the checksum is kept in the pointer to that block, not in the data block itself.
Checksums propagate all the way up the filesystem tree to the root node (the uberblock), which is itself checksummed.
When data is read, its checksum is recalculated and compared to the stored value.
On a mismatch, a self-healing mechanism kicks in: the bad block is rebuilt from the redundant copies or parity available in the RAID-Z/mirror configuration, verified against its checksum, and written back to repair the damage.
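
Integrity can also be verified on demand: a scrub reads every block in the pool, compares it to its checksum and repairs any mismatch from redundancy. A minimal sketch, with tank as a placeholder pool name:

    # Walk every block in the pool and repair silently corrupted data
    zpool scrub tank
    # Report any checksum errors found and repaired
    zpool status -v tank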

Caching

ARC & L2ARC

The ARC is the “adaptive replacement cache”.
The ARC is a very fast block-level cache located in the system’s memory.
Any read request for data already in the cache is served directly from the ARC instead of hitting the much slower hard drives.
This gives a noticeable performance increase for frequently accessed data.

The L2ARC is the second-level adaptive replacement cache, typically placed on fast SSDs, and is often called the “cache drive” in ZFS systems.
The algorithms that manage L2ARC population are automatic and intelligent.
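
As an illustration, an SSD can be attached to a pool as an L2ARC cache device with a single command (pool and device names are placeholders):

    # Add an SSD as a second-level read cache (L2ARC)
    zpool add tank cache nvme0n1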

ZIL

The ZIL is the “ZFS intent log”. It acts as a logging mechanism for synchronous writes, holding them until they are safely committed to the main data structures of the storage pool.
The speed at which data can be written to the ZIL determines how fast synchronous write requests can be acknowledged.
Placing the ZIL on fast, dedicated devices therefore improves synchronous write performance.
Like L2ARC, the ZIL is managed automatically and intelligently by ZFS.
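
As a sketch, a mirrored pair of fast devices can be dedicated to the ZIL as a separate log device (pool and device names are placeholders):

    # Add two mirrored SSDs as a dedicated log device for the ZIL
    zpool add tank log mirror nvme1n1 nvme2n1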

SSD Drives

High-performance drives can be added to a storage pool to create a hybrid storage pool.
SSDs can be added to a ZFS pool as “cache” devices (for the L2ARC), and SSD or DRAM-based devices as “log” devices (for the ZIL).

By adding devices for both the L2ARC and the ZIL, both reads and writes are accelerated.

Copy On Write

ZFS applies a copy-on-write transactional model to writing data blocks.
Blocks that contain data are never overwritten in place; instead, a new block is written.
When new data is written, all the connected metadata and the block tree structure are updated to point to it.
The copy-on-write model is what makes snapshots possible.
Snapshots allow the system to revert to the exact state it had when they were taken.
This technique also allows the creation of instant clones, without any additional space requirements.
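
For example, assuming a dataset named tank/data, snapshot, rollback and clone operations look like this:

    # Take an instant snapshot that consumes no additional space
    zfs snapshot tank/data@before-upgrade
    # Revert the dataset to the state captured by the snapshot
    zfs rollback tank/data@before-upgrade
    # Create a writable clone of the snapshot without copying any data
    zfs clone tank/data@before-upgrade tank/data-clone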

Hybrid Data Pools

ZFS supports combining SSD, SAS and SATA devices in the same data pool without sacrificing data access.
This mix allows the use of high performance devices to create a tiered storage architecture for increased performance.

Variable Block Size

ZFS supports variable data block sizes for volumes and filesystems.
This lets ZFS be tuned to an application’s block size requirements, for both performance and space usage.
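
For example, the record size of a filesystem and the block size of a volume can be matched to the application; dataset names and sizes below are placeholders:

    # Use 16 KB records for a database filesystem
    zfs set recordsize=16K tank/db
    # Create a 100 GB volume with an 8 KB block size for an iSCSI LUN
    zfs create -V 100G -b 8K tank/vm-disk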

Snapshot Replication

Snapshots taken with ZFS can be replicated incrementally at the block level.
Only the block-level differences between two snapshots are synchronised, making on-site and off-site replication very efficient in terms of network usage.
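
A minimal sketch of incremental replication between two snapshots, streamed to a remote host over SSH (host, pool and snapshot names are placeholders):

    # Send only the blocks that changed between the two snapshots
    zfs send -i tank/data@monday tank/data@tuesday | ssh backup-host zfs receive backup/data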

Snapshots can be scheduled according to different SLAs.

Compression

ZFS compresses data in-line as it is written to disk, instead of running compression afterwards.
Compression algorithms range from space-efficient ones like gzip-9 to performance-oriented ones like LZ4.
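
Compression is enabled per dataset with a single property, for example LZ4 (the dataset name is a placeholder):

    # Enable LZ4 compression; newly written data is compressed in-line
    zfs set compression=lz4 tank/data
    # Check the achieved compression ratio
    zfs get compressratio tank/data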

Deduplication

ZFS provides block-level in-line deduplication using cryptographically strong 256-bit checksums such as SHA-256.
Deduplication is performed synchronously, using the available CPU power, across the entire storage pool.
Deduplication is also granular: it can be enabled or disabled on a per-dataset basis.
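
Deduplication is likewise a per-dataset property. A minimal sketch, with placeholder names:

    # Enable in-line deduplication on a single dataset
    zfs set dedup=on tank/vm-images
    # Check the pool-wide deduplication ratio
    zpool get dedupratio tank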

Clustering

The ZFS solution uses a highly scalable software architecture. It provides an active-passive or active-active clustering mechanism with automatic balancing of storage pools between nodes.

The solution integrates ALUA, which allows LUNs to be seen on all storage nodes present in the cluster, creating multiple access paths. The HA architecture provides split-brain fencing mechanisms, including gratuitous ARP checks and SCSI-2 reservations.

Storage pools can be actively migrated between nodes to achieve load balancing.

VMware Plugins

Volumes and filesystems residing on ZFS solutions can be shared over iSCSI/FC, NFS and InfiniBand with any VMware hypervisor.

The solution can take memory-consistent snapshots of hosted VMware VMs. VMs are automatically quiesced during datastore snapshots to ensure consistency.

Additional datastore space provisioning is done automatically, and expanding a LUN will also expand its attached datastore.