The Design and Implementation of a Log-Structured File System
'The Design and Implementation of a Log-Structured File System' by Rosenblum and Ousterhout was published in ACM Transactions on Computer Systems in 1992. The paper introduced log-structured storage, which doubles down on the concept of leveraging faster sequential accesses on disks. Even though the log-structured file system (LFS) was designed for hard disk drives (HDDs) in the 1990s, its ideas around log-structuring and cleaning have been influential in the design of modern storage systems including file systems, key-value stores, and the flash translation layers (FTLs) of solid-state drives (SSDs). We will discuss the motivation and key design aspects of LFS, as described in the paper, while also borrowing from the OSTEP book.
The authors of LFS note that the fast file system (FFS) was not fast enough! Despite its disk-aware layout centered around cylinder groups, FFS tends to spread data around the disk. As a concrete example of this problem, consider the number of accesses required for creating a new small file with FFS. FFS needs to update the file data, the file metadata (file inode), the data for the directory containing the file, the directory's metadata (directory inode), and the bitmaps that track which data and inode blocks are free. All of these structures are stored separately on the disk. To make matters worse, FFS performs these metadata updates synchronously on the disk as opposed to buffering them in main memory and performing them in the background. Because of this, metadata-heavy workloads, e.g., workloads that create a lot of small files, become bottlenecked on disk seeks.
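As a rough illustration of the seek cost (this is just a sketch of the bookkeeping involved, not FFS code), consider the distinct on-disk structures touched by one small-file create:

```python
# Rough illustration (not actual FFS code): the distinct on-disk structures
# FFS must update to create one small file, each potentially requiring a seek.
ffs_small_file_create_updates = [
    "inode bitmap",           # mark the new inode as allocated
    "data block bitmap",      # mark the new data block as allocated
    "file inode",             # initialize the file's metadata
    "file data block",        # write the file's contents
    "directory data block",   # add the (name, inode number) entry
    "directory inode",        # update the directory's size and modification time
]

# Each structure lives in a different region of the disk, and the metadata
# updates are performed synchronously, so a small-file workload pays roughly
# one seek per update.
print(f"{len(ffs_small_file_create_updates)} separate on-disk updates")
```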
The authors of LFS also observed a technology trend: systems had increasingly large main memory sizes. This trend implied that more and more of the data that applications want to read would be cached in main memory, leading to reduced read traffic to the disk. Thus the write traffic would dominate disk accesses (writes have to be performed because they cannot be cached in main memory forever). This observation motivated the authors to optimize LFS for writes rather than reads.
The basic idea of LFS is fairly straightforward: always write to disk in big sequential chunks, termed segments, because sequential writes are faster than random writes on HDDs. In order to do so, LFS buffers blocks in main memory long enough to accumulate enough data to fill an entire segment. LFS lays out these buffered blocks sequentially in a segment and writes the segment to disk. The log is the one and only source of data in LFS. This design of always writing to the log, and the log being the one and only source of truth, also extends to directories, which are just special files whose data consists of a list of files and their inode numbers.
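A minimal sketch of this write path, assuming arbitrary segment and block sizes and a disk object that only supports appending (the names SegmentWriter, SEGMENT_SIZE, and AppendOnlyDisk are illustrative, not LFS's actual interfaces):

```python
SEGMENT_SIZE = 512 * 1024          # assumed segment size (e.g., 512 KB)
BLOCK_SIZE = 4096                  # assumed block size

class AppendOnlyDisk:
    """Stand-in for the on-disk log: data can only be appended at the end."""
    def __init__(self):
        self.data = bytearray()
    def append(self, chunk: bytes):
        self.data += chunk

class SegmentWriter:
    """Buffers dirty blocks in memory and flushes them as one sequential write."""
    def __init__(self, disk: AppendOnlyDisk):
        self.disk = disk
        self.buffer = []           # blocks accumulated for the current segment

    def write_block(self, block: bytes):
        assert len(block) == BLOCK_SIZE
        self.buffer.append(block)
        if len(self.buffer) * BLOCK_SIZE >= SEGMENT_SIZE:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        segment = b"".join(self.buffer)   # one large, contiguous chunk
        self.disk.append(segment)         # a single sequential write to the log
        self.buffer.clear()

writer = SegmentWriter(AppendOnlyDisk())
writer.write_block(b"\0" * BLOCK_SIZE)    # buffered in memory until the segment fills
```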
LFS's seemingly simple design poses the challenge of finding data in the log. LFS data updates are out-of-place, i.e., when a file's data is updated, the data is written at a new location (in a block in the new segment) as opposed to the location that held the old data. LFS needs to be able to find the most up-to-date data for any given file block. LFS uses file inodes, similar to FFS, to store the locations of data blocks (along with other file metadata like its size and update time). In FFS, file inodes were stored at fixed locations on the disk. LFS could use fixed-location inodes, but because each file write requires a write to its inode (recall that a file write changes the location of its data block), LFS would need to update these inodes outside the log and abandon its always-write-sequentially rule. So LFS stores the inodes in the log as well. This, of course, introduces the problem of how to find the inodes. LFS introduces inode maps to address this problem. Inode maps store the location of an inode, indexed by the inode number. This only shifts the problem though: each inode write (triggered by each file write) triggers an inode map write. To ensure that even the inode map writes are sequential, LFS writes the inode map to the log as well (surprise, surprise!). But then, again, how does LFS find the inode map in the log? LFS stores the locations of the inode map blocks at a fixed location on the disk (finally!), called the checkpoint region. However, the checkpoint region is written lazily (e.g., once every 30 seconds) as opposed to on every update to the inode map.
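To make the chain of indirection concrete, here is a minimal lookup sketch with in-memory dictionaries standing in for on-disk blocks (the structures and field names are illustrative, not LFS's on-disk format):

```python
# Illustrative in-memory stand-ins for on-disk structures; real LFS stores
# these as blocks in the log, addressed by disk offsets.
disk = {}                # disk address -> block contents (bytes or a dict)
CHECKPOINT_ADDR = 0      # the one fixed location on the disk

def read_file_block(inode_number: int, block_index: int) -> bytes:
    """Follow the chain: checkpoint region -> inode map -> inode -> data block."""
    checkpoint = disk[CHECKPOINT_ADDR]                 # fixed-location read
    imap_block = disk[checkpoint["imap_addr"]]         # inode map lives in the log
    inode = disk[imap_block[inode_number]]             # inode also lives in the log
    return disk[inode["block_pointers"][block_index]]  # finally, the data block

# Tiny example: a one-block file with inode number 7.
disk[100] = b"hello, log-structured world"             # data block
disk[200] = {"block_pointers": [100]}                  # inode for file 7
disk[300] = {7: 200}                                   # inode map block
disk[CHECKPOINT_ADDR] = {"imap_addr": 300}             # checkpoint region
assert read_file_block(7, 0) == b"hello, log-structured world"
```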
The introduction of inode maps is a classic example of indirection, a common approach for solving problems in computer systems. LFS could have kept the inode maps at a fixed location and updated them lazily, or even kept the inodes at a fixed location and updated them lazily. However, to be able to update some data lazily, it must be kept buffered in main memory. By introducing the levels of indirection, LFS reduces the amount of data that needs to be cached (one block of the checkpoint region can store the locations of multiple blocks of the inode map, and each block of the inode map can store the locations of multiple inodes).
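As a back-of-the-envelope illustration of why this helps (the block and address sizes below are assumptions for the example, not figures from the paper):

```python
BLOCK_SIZE = 4096        # assumed block size in bytes
ADDR_SIZE = 8            # assumed size of one disk address in bytes

imap_blocks_per_checkpoint_block = BLOCK_SIZE // ADDR_SIZE   # 512
inodes_per_imap_block = BLOCK_SIZE // ADDR_SIZE              # 512

# One cached checkpoint block locates 512 inode map blocks, which in turn
# locate 512 * 512 = 262,144 inodes.
print(imap_blocks_per_checkpoint_block * inodes_per_imap_block)
```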
LFS's design is clearly optimized for the write path and has the baked-in assumption that frequently read data can be cached in main memory. Indeed, if LFS were to read some data from disk, in the worst case it would require multiple random accesses: first to the checkpoint region, then to the inode map, then to the inode, and then to the (multiple) data blocks, all of which could be spread across the log on the disk. In practice, LFS is able to cache all of the inode map blocks, so in a typical scenario it only has to randomly access the inode and data blocks from the disk.
LFS's out-of-place updates also necessitate garbage collection of the old data to ensure that the disk does not run out of space. This is reminiscent of the garbage collection problem of SSDs; however, LFS pre-dates SSDs, and the garbage collection policies of SSD FTLs are in fact inspired by LFS. LFS garbage collects data by reading the valid data out of segments and writing it to new, consolidated segments; this process is called segment cleaning. The cost of segment cleaning depends on the utilization of the segments (the fraction of live data to total data in the segment): segments with higher utilization require more cleaning work.
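A minimal sketch of segment cleaning, reusing the SegmentWriter idea from above and assuming some is_live predicate (in the real system, liveness is determined with the help of per-segment summary information and the inode map):

```python
def clean_segments(segments, is_live, segment_writer):
    """Copy live blocks out of the given segments and append them to the log,
    after which the source segments can be marked free."""
    for segment in segments:                       # each segment is a list of blocks
        for block in segment:
            if is_live(block):                     # e.g., checked against the inode map
                segment_writer.write_block(block)  # rewritten into a new, compact segment
    return len(segments)                           # number of segments reclaimed
```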
The LFS authors first experimented with the seemingly obvious choice of a greedy cleaning policy but found that it is suboptimal for the skewed access patterns commonly found in file systems. File system accesses are often skewed such that a small fraction of files are accessed at a disproportionately higher frequency (e.g., 10% of the files are accessed 90% of the time). The files that are updated frequently are said to contain hot data, and files that are updated rarely are said to contain cold data. A greedy cleaning policy, i.e., one which always selects the segments with the lowest utilization, keeps selecting the segments with hot data. This has two downsides. First, because the data is hot, their segments require frequent cleaning (the hot data keeps becoming outdated soon after cleaning). It would be worthwhile for LFS to wait longer before cleaning the segments with hot data to amortize the cleaning cost across more updates to the hot data. Second, for segments that contain cold data, it is valuable to clean and consolidate them even if their utilization is relatively high (and cleaning them is therefore relatively expensive). Because this data is cold, once their segments have been cleaned and consolidated (into, say, a single segment of cold data), the cold segment is unlikely to be updated in the near future and require any more cleaning work.
LFS uses a cost-benefit cleaning policy. It balances the cost of cleaning, as determined by the utilization, with the benefit of the cleaning in terms of the longevity of the cleaned segments. Longevity is the time for which the cleaned data will remain unchanged (and thus not require cleaning). Of course, longevity is difficult to predict because it depends on future accesses. LFS makes the simplifying assumption (a common assumption in computer systems) that history is indicative of the future, substituting longevity (how long will a data block live) with age (how long has a data block lived). Segments with cold data are considered to have higher longevity and can thus be cleaned at a relatively higher utilization. LFS tracks the utilization of a segment and its age (in terms of the time of the most recent modification to any block in the segment) in a segment usage table.
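The paper expresses this policy as a benefit-to-cost ratio, roughly (1 - u) * age / (1 + u) for a segment with utilization u: cleaning reads the whole segment (cost 1) and writes back the live fraction u, freeing (1 - u) of a segment's worth of space, weighted by the segment's age. A small sketch of segment selection using this ratio (the table format and helper names are illustrative):

```python
def benefit_to_cost(utilization: float, age: float) -> float:
    """Benefit-to-cost ratio: free space generated, weighted by age, divided by
    the I/O cost of reading the segment (1) and writing back its live data (u)."""
    u = utilization
    return (1.0 - u) * age / (1.0 + u)

def pick_segments_to_clean(segment_usage_table, count: int):
    """Select the segments with the highest benefit-to-cost ratio.
    segment_usage_table: iterable of (segment_id, utilization, age) tuples."""
    ranked = sorted(segment_usage_table,
                    key=lambda seg: benefit_to_cost(seg[1], seg[2]),
                    reverse=True)
    return [seg_id for seg_id, _, _ in ranked[:count]]

# Example: a cold, fairly full segment can outrank a hot, emptier one.
table = [("hot", 0.30, 1.0), ("cold", 0.75, 20.0)]
print(pick_segments_to_clean(table, 1))   # -> ['cold']
```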
The LFS paper describes many other nitty-gritty details that go towards making the seemingly simple design of LFS work. The paper also discusses crash recovery, an important aspect of file systems, in detail. We will discuss crash recovery in a future post, wherein we will compare and contrast some of the classical crash recovery mechanisms, including that of LFS. So long, and thanks for reading!