Table of Contents
Accessing Files
Reading from a File
Read-Ahead of Files
Writing to a File
Memory Mapping
Non-Linear Memory Mapping
Direct I/O Transfer
Asynchronous I/O
Home Database Mysql Tutorial Accessing a File (Linux Kernel)

Accessing a File (Linux Kernel)

Jun 07, 2016 pm 03:36 PM
ac file kernel linux

Accessing Files Different Ways to Access a File Canonical Mode (O_SYNC and O_DIRECT cleared) Synchronous Mode (O_SYNC flag set) Memory Mapping Mode Direct I/O Mode (O_DIRECT flag set, user space - disk) Asynchronous Mode Reading a file is

Accessing Files

Different Ways to Access a File

ð  Canonical Mode (O_SYNC and O_DIRECT cleared)

ð  Synchronous Mode (O_SYNC flag set)

ð  Memory Mapping Mode

ð  Direct I/O Mode (O_DIRECT flag set, user space disk)

ð  Asynchronous Mode

 

Reading a file is always page-based: the kernel always transfers whole pages of data at once.

Allocate a new page frame -> fill the page with suitable portion of the file -> add the page to the page cache -> copy the requested bytes to the process address space

 

Writing to a file may involve disk space allocation because the file size may increase.

 

Reading from a File

/**

 * do_generic_file_read - generic file read routine

 * @filp:  the file to read

 * @ppos:        current file position

 * @desc:        read_descriptor

 * @actor:       read method

 *

 * This is a generic file read routine, and uses the

 * mapping->a_ops->readpage() function for the actual low-level stuff.

 *

 * This is really ugly. But the goto's actually try to clarify some

 * of the logic when it comes to error handling etc.

 */

static void do_generic_file_read(struct file *filp, loff_t *ppos,

                   read_descriptor_t *desc, read_actor_t actor)

 

 

Read-Ahead of Files

Many disk accesses are sequential, that is, many adjacent sectors on disk are likely to be fetched when handling a series of process’s read requests on the same file.

Read-ahead consists of reading several adjacent pages of data of a regular file or block device file before they are actually requested. In most cases, this greatly improves the system performance, because it lets the disk controller handle fewer commands. In some cases, the kernel reduces or stops read-ahead when some random accesses to a file are performed.

 

Natural language description -> design (data structure + algo) -> code

Description:

ð  Read-ahead may be gradually increased as long as the process keeps accessing the file sequentially.

ð  Read-ahead must be scaled down when or even disabled when the current access is not sequential.

ð  Read-ahead should be stopped when the process keeps accessing the same page over and over again or when almost all the pages of the file are in the cache.

 

 

 

Design:

Current window: a contiguous portion of the file consisting of pages being requested by the process

 

Ahead window: a contiguous portion of the file following the ones in the current window

 

/*

 * Track a single file's readahead state

 */

struct file_ra_state {

       pgoff_t start;                     /* where readahead started */

       unsigned int size;              /* # of readahead pages */

       unsigned int async_size;   /* do asynchronous readahead when

                                      there are only # of pages ahead */

 

       unsigned int ra_pages;            /* Maximum readahead window */

       unsigned int mmap_miss;        /* Cache miss stat for mmap accesses */

       loff_t prev_pos;          /* Cache last read() position */

};

 

 

struct file {

       struct file_ra_state    f_ra;

}

 

When is read-ahead algorithm executed?

1.     Read pages of file data

2.     Allocate a page for a file memory mapping

3.     Readahead(), posix_fadvise(), madvise()

 

Writing to a File

Deferred write

 

Memory Mapping

ð  Shared Memory Mapping

ð  Private Memory Mapping

 

System call: mmap(), munmap(), msync()

mmap, munmap - map or unmap files or devices into memory

msync - synchronize a file with a memory map

 

The kernel offers several hooks to customize the memory mapping mechanism for every different filesystem. The core of memory mapping implementation is delegated to a file object’s method named mmap. For disk-based filesystems and for block devices, this method is implemented by a generic function called generic_file_mmap().

 

 

Memory mapping mechanism depends on the demand paging mechanism.

For reasons of efficiency, page frames are not assigned to a memory mapping right after it has been created, but at the last moment that is, when the process tries to address one of its pages, thus causing a Page Fault exception.

 

Non-Linear Memory Mapping

The  remap_file_pages()  system call is used to create a non-linear mapping, that is, a mapping in which the pages of the file are mapped into a non-sequen

       tial order in memory.  The advantage of using remap_file_pages() over using repeated calls to mmap(2) is that the former approach does not require the  ker

       nel to create additional VMA (Virtual Memory Area) data structures.

 

       To create a non-linear mapping we perform the following steps:

 

       1. Use mmap(2) to create a mapping (which is initially linear).  This mapping must be created with the MAP_SHARED flag.

 

       2. Use  one  or more calls to remap_file_pages() to rearrange the correspondence between the pages of the mapping and the pages of the file.  It is possible

          to map the same page of a file into multiple locations within the mapped region.

 

 

Direct I/O Transfer

There’s no substantial difference between:

1.     Accessing a regular file through filesystem

2.     Accessing it by referencing its blocks on the underlying block device file

3.     Establish a file memory mapping

 

However, some highly-sophisticated programs (self-caching application such as high-performance server) would like to have full control of the I/O data transfer mechanism.

 

Linux offers a simple way to bypass the page cache: direct I/O transfer.

O_DIRECT

 

Generic_file_direct_IO() -> __block_dev_direct_IO(), it does not return until all direct IO data transfers have been completed.

 

 

Asynchronous I/O

“Asynchronous” essentially means that when a User Mode process invokes a library function to read or write a file, the function terminates as soon as the read or write operation has been enqueued, possibly even before the real I/O data transfer takes place. The calling process thus continue its execution while the data is being transferred.

 

aio_read(3), aio_cancel(3), aio_error(3), aio_fsync(3), aio_return(3), aio_suspend(3), aio_write(3)

 

Asynchronous I/O Implementation

ð  User-level Implementation

ð  Kernel-level Implementation

 

User-level Implementation:

Clone the current process -> the child process issues synchronous I/O requests -> aio_xxx terminates in parent process

 

io_setup(2), io_cancel(2), io_destroy(2), io_getevents(2), io_submit(2)

 

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Repo: How To Revive Teammates
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Android TV Box gets unofficial Ubuntu 24.04 upgrade Android TV Box gets unofficial Ubuntu 24.04 upgrade Sep 05, 2024 am 06:33 AM

For many users, hacking an Android TV box sounds daunting. However, developer Murray R. Van Luyn faced the challenge of looking for suitable alternatives to the Raspberry Pi during the Broadcom chip shortage. His collaborative efforts with the Armbia

deepseek web version entrance deepseek official website entrance deepseek web version entrance deepseek official website entrance Feb 19, 2025 pm 04:54 PM

DeepSeek is a powerful intelligent search and analysis tool that provides two access methods: web version and official website. The web version is convenient and efficient, and can be used without installation; the official website provides comprehensive product information, download resources and support services. Whether individuals or corporate users, they can easily obtain and analyze massive data through DeepSeek to improve work efficiency, assist decision-making and promote innovation.

How to install deepseek How to install deepseek Feb 19, 2025 pm 05:48 PM

There are many ways to install DeepSeek, including: compile from source (for experienced developers) using precompiled packages (for Windows users) using Docker containers (for most convenient, no need to worry about compatibility) No matter which method you choose, Please read the official documents carefully and prepare them fully to avoid unnecessary trouble.

BitPie Bitpie wallet app download address BitPie Bitpie wallet app download address Sep 10, 2024 pm 12:10 PM

How to download BitPie Bitpie Wallet App? The steps are as follows: Search for "BitPie Bitpie Wallet" in the AppStore (Apple devices) or Google Play Store (Android devices). Click the "Get" or "Install" button to download the app. For the computer version, visit the official BitPie wallet website and download the corresponding software package.

BITGet official website installation (2025 beginner's guide) BITGet official website installation (2025 beginner's guide) Feb 21, 2025 pm 08:42 PM

BITGet is a cryptocurrency exchange that provides a variety of trading services including spot trading, contract trading and derivatives. Founded in 2018, the exchange is headquartered in Singapore and is committed to providing users with a safe and reliable trading platform. BITGet offers a variety of trading pairs, including BTC/USDT, ETH/USDT and XRP/USDT. Additionally, the exchange has a reputation for security and liquidity and offers a variety of features such as premium order types, leveraged trading and 24/7 customer support.

Detailed explanation: Shell script variable judgment parameter command Detailed explanation: Shell script variable judgment parameter command Sep 02, 2024 pm 03:25 PM

The system variable $n is the parameter passed to the script or function. n is a number indicating the number of parameters. For example, the first parameter is $1, and the second parameter is $2$? The exit status of the previous command, or the return value of the function. Returns 0 on success, 1 on failure $#Number of parameters passed to the script or function $* All these parameters are enclosed in double quotes. If a script receives two parameters, $* is equal to $1$2$0The name of the command being executed. For shell scripts, this is the path to the activated command. When $@ is enclosed in double quotes (""), it is slightly different from $*. If a script receives two parameters, $@ is equivalent to $1$2$$the process number of the current shell. For a shell script, this is the process I when it is executing

Zabbix 3.4 Source code compilation installation Zabbix 3.4 Source code compilation installation Sep 04, 2024 am 07:32 AM

1. Installation environment (Hyper-V virtual machine): $hostnamectlStatichostname:localhost.localdomainIconname:computer-vmChassis:vmMachineID:renwoles1d8743989a40cb81db696400BootID:renwoles272f4aa59935dcdd0d456501Virtualization:microsoftOperatingSystem:CentOS Linux7(Core)CPEOSName:cpe:

Ouyi okx installation package is directly included Ouyi okx installation package is directly included Feb 21, 2025 pm 08:00 PM

Ouyi OKX, the world's leading digital asset exchange, has now launched an official installation package to provide a safe and convenient trading experience. The OKX installation package of Ouyi does not need to be accessed through a browser. It can directly install independent applications on the device, creating a stable and efficient trading platform for users. The installation process is simple and easy to understand. Users only need to download the latest version of the installation package and follow the prompts to complete the installation step by step.

See all articles