Flexible FTL-based Address Mappings for Flash
May 6, 2014
Flash is growing in popularity
Fundamental physical differences (erase cycles, wear-out)
Software must adapt to hardware's characteristics
New software techniques can exploit flash in useful ways
But existing software stack makes flash look just like a disk!
Standard block read/write interface over SATA/SAS (sometimes PCIe)
Flash translation layer (FTL) manages hardware's requirements
On-device firmware or host-based software
Log-structured writes to avoid read-erase-rewrite cycles
Remap logical blocks to physical locations
Space management: garbage collection
FTL keeps validity bitmap
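For concreteness, a minimal C sketch of that bookkeeping: a logical-to-physical map, a per-page validity bitmap, and log-structured writes. All names and sizes are illustrative, and flash_program() is a stand-in for the device's NAND program operation, not any real firmware API.

    #include <stdint.h>

    #define NUM_LPAGES  (1u << 20)          /* logical pages exposed to the host */
    #define NUM_PPAGES  (1u << 21)          /* physical flash pages */
    #define INVALID_PPA UINT32_MAX

    static uint32_t l2p[NUM_LPAGES];        /* logical-to-physical map; assumed
                                               initialized to INVALID_PPA at format */
    static uint8_t  valid[NUM_PPAGES / 8];  /* validity bitmap: 1 bit per physical page */
    static uint32_t write_head;             /* next free page in the log */

    void flash_program(uint32_t ppa, const void *data);  /* hypothetical NAND program op */

    /* Log-structured write: always append at the write head and remap,
     * never read-erase-rewrite in place. */
    static void ftl_write(uint32_t lpa, const void *data)
    {
        uint32_t old_ppa = l2p[lpa];
        uint32_t new_ppa = write_head++;    /* allocation policy elided */

        flash_program(new_ppa, data);

        if (old_ppa != INVALID_PPA)
            valid[old_ppa / 8] &= ~(1u << (old_ppa % 8));  /* stale copy becomes garbage */
        valid[new_ppa / 8] |= 1u << (new_ppa % 8);
        l2p[lpa] = new_ppa;
    }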
Existing software works, but isn't ideal.
What we want:
Greater efficiency
New features
New software offering different interfaces can better exploit existing hardware to achieve this.
Stackable block storage layer:
FTL-like in many ways (log-structured, CoW oriented)
Presents large, sparse address space mapped to smaller underlying storage
Allows applications to explicitly manipulate address map
My suggestion:
FuBAR (vetoed)
But in the absence of anything better...
Remains backward-compatible with traditional read/write block IO interface
Additionally offers new operations, range move and clone
Also available in vectored atomic flavors
Allows efficient implementation of new features and opportunities for improvements in existing systems
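A hypothetical C rendering of what such an interface could look like; these declarations are a sketch of the shape of the operations, not the actual API.

    #include <stddef.h>
    #include <stdint.h>

    enum range_op_type { RANGE_MOVE, RANGE_CLONE };

    struct range_op {
        enum range_op_type type;   /* move or clone */
        uint64_t src;              /* source offset */
        uint64_t dst;              /* destination offset */
        uint64_t len;              /* length of the range */
    };

    /* Remap [src, src+len) to [dst, dst+len); the source mapping is removed. */
    int dev_range_move(int fd, uint64_t src, uint64_t dst, uint64_t len);

    /* Map [dst, dst+len) to the same data as [src, src+len); both mappings
     * stay valid, with copy-on-write on later overwrites. */
    int dev_range_clone(int fd, uint64_t src, uint64_t dst, uint64_t len);

    /* Apply a batch of moves/clones atomically: all take effect or none do. */
    int dev_range_vec_atomic(int fd, const struct range_op *ops, size_t n);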
Range-move [0,2) to [4,6)
Clone [2,4) to [5,7)
CoW semantics: when one mapping is overwritten, the cloned data remains intact at the other mappings.
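A toy C model of the assumed map semantics, using the example ranges above; the flat 8-entry array stands in for the real (much larger, sparse) address map.

    #include <stdint.h>
    #include <stdio.h>

    #define UNMAPPED UINT32_MAX
    static uint32_t map[8];                 /* toy logical-to-physical map */

    static void range_move(uint32_t src, uint32_t dst, uint32_t len)
    {
        for (uint32_t i = 0; i < len; i++) {
            map[dst + i] = map[src + i];    /* destination now points at the data */
            map[src + i] = UNMAPPED;        /* source mapping is dropped */
        }
    }

    static void range_clone(uint32_t src, uint32_t dst, uint32_t len)
    {
        for (uint32_t i = 0; i < len; i++)
            map[dst + i] = map[src + i];    /* both map to the same physical blocks;
                                               CoW applies on a later overwrite */
    }

    int main(void)
    {
        for (uint32_t i = 0; i < 8; i++)
            map[i] = 100 + i;               /* fake physical block addresses */

        range_move(0, 4, 2);                /* range-move [0,2) to [4,6) */
        range_clone(2, 5, 2);               /* clone [2,4) to [5,7) */

        for (uint32_t i = 0; i < 8; i++) {
            if (map[i] == UNMAPPED)
                printf("L%u -> unmapped\n", i);
            else
                printf("L%u -> P%u\n", i, map[i]);
        }
        return 0;
    }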
Volume snapshots for backups, auditing, etc.
Simply clone the entire address space of a volume (one-call sketch below)
Sparse address space allows many volumes within a single device
Atomic, time- and space-efficient
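Assuming the hypothetical dev_range_clone() sketched above, a whole-volume snapshot reduces to a single call; VOL_BASE, VOL_SIZE, and SNAP_BASE are placeholder values for the volume's region and an unused region of the sparse space.

    #define VOL_BASE  0x000000000ULL   /* placeholder: volume's logical base */
    #define VOL_SIZE  0x040000000ULL   /* placeholder: 1 GiB volume */
    #define SNAP_BASE 0x800000000ULL   /* placeholder: unused sparse region */

    /* Whole-volume snapshot: one clone over the volume's logical range.
     * Atomic, and no data is copied on the device. */
    int snapshot_volume(int devfd)
    {
        return dev_range_clone(devfd, VOL_BASE, SNAP_BASE, VOL_SIZE);
    }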
Smaller clones allow easy implementation of advanced FS features:
Zero-copy/CoW file snapshots (as in ZFS, BTRFS) easily added to conventional filesystems
FS need only allocate a block range and issue a clone operation
Could also provide back-end mechanism for a cp in O(1) time and space
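A sketch of how a filesystem might wire a file snapshot (or reflink-style cp) to the clone primitive; fs_alloc_extent() is a hypothetical allocator, and dev_range_clone() is the illustrative call from the interface sketch above.

    #include <stddef.h>
    #include <stdint.h>

    struct extent { uint64_t start; uint64_t len; };

    int fs_alloc_extent(uint64_t len, struct extent *out);  /* hypothetical FS allocator */

    int fs_snapshot_file(int devfd, const struct extent *src, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            struct extent dst;
            if (fs_alloc_extent(src[i].len, &dst) != 0)
                return -1;
            /* No data moves: the new file's blocks map to the same physical
             * locations until either copy is overwritten. */
            if (dev_range_clone(devfd, src[i].start, dst.start, src[i].len) != 0)
                return -1;
        }
        return 0;
    }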
Range moves can improve efficiency of existing systems (especially with vectored atomics)
Write-ahead logging (RDBMS, journaling FS):
Conventional approach uses double-write (once to log/journal, then again to "home")
Can instead write to scratch location, then atomically move data to home location (see the commit sketch below)
80% TPC-C improvement with MySQL
Reduced write traffic also lengthens device lifespan
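A sketch of the single-write commit path this enables, assuming the hypothetical dev_range_vec_atomic() and struct range_op from the interface sketch, plus a made-up dev_write() helper; SCRATCH_BASE is an arbitrary scratch-region offset.

    #include <stddef.h>
    #include <stdint.h>

    #define SCRATCH_BASE 0x400000000ULL   /* placeholder scratch-region offset */

    int dev_write(int fd, uint64_t off, const void *buf, uint64_t len);  /* hypothetical */

    int txn_commit(int devfd, const uint64_t *home, const void *const *data,
                   const uint64_t *len, size_t n)
    {
        struct range_op ops[n];            /* C99 VLA; fine for a sketch */
        uint64_t scratch = SCRATCH_BASE;

        /* 1. Write each block's new contents exactly once, into scratch space. */
        for (size_t i = 0; i < n; i++) {
            dev_write(devfd, scratch, data[i], len[i]);
            ops[i] = (struct range_op){ .type = RANGE_MOVE,
                                        .src  = scratch,
                                        .dst  = home[i],
                                        .len  = len[i] };
            scratch += len[i];
        }

        /* 2. One vectored atomic range-move publishes everything at its home
         *    address: the whole transaction becomes visible, or none of it. */
        return dev_range_vec_atomic(devfd, ops, n);
    }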
Garbage collection becomes much more complicated.
Address map is many-to-one (M:1), not 1:1, so simple validity bitmaps no longer work.
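One natural replacement (an assumption here, not necessarily what the system does): per-physical-block reference counts, decremented as mappings are removed or overwritten. The l2p_lookup/l2p_update/mark_garbage helpers are hypothetical.

    #include <stdint.h>

    #define NUM_PBLOCKS (1u << 22)
    #define NO_MAPPING  UINT64_MAX

    static uint16_t refcount[NUM_PBLOCKS];  /* 0 means the physical block is garbage */

    uint64_t l2p_lookup(uint64_t lba);      /* hypothetical map lookup */
    void     l2p_update(uint64_t lba, uint64_t pba);
    void     mark_garbage(uint64_t pba);    /* hand block to the garbage collector */

    /* Install (or replace) a single logical-to-physical mapping. */
    static void map_set(uint64_t lba, uint64_t pba)
    {
        uint64_t old = l2p_lookup(lba);
        if (old != NO_MAPPING && --refcount[old] == 0)
            mark_garbage(old);              /* no logical address references it now */
        refcount[pba]++;
        l2p_update(lba, pba);
    }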
Metadata persistence:
Can't acknowledge a write until both data and metadata have been safely stored
Tricky to do efficiently with only block-granularity storage
Thanks!
Questions?