
How does the internet work? Part 1

Linda Hamilton
Release: 2024-10-05 22:18:02

Ever wonder what happens when you click a link? How The Internet Works takes you behind the scenes of the digital world, breaking down complex tech into simple, bite-sized insights. From data packets to servers and beyond, discover the magic that powers your online experience! (Hook written with the help of AI, because I can't :D)

What happens when you go to google.com?

The "g" key is pressed

Let me explain the physical keyboard actions and the OS interrupts. When you press the "g" key, the browser registers the event, triggering the auto-complete functions. Based on your browser's algorithm and whether you're in regular or private/incognito mode, various suggestions appear in a dropdown beneath the URL bar.

These suggestions are typically prioritized and sorted using factors such as your search history, bookmarks, cookies, and popular internet searches. As you continue typing "google.com," numerous processes run in the background, and the suggestions refine with each keystroke. The browser might even predict "google.com" before you've finished typing.

Figure: Browsing autocomplete sequences

The "enter" key bottoms out

To establish a starting point, let's consider the Enter key on a keyboard when it reaches the bottom of its travel range. At this moment, an electrical circuit dedicated to the Enter key is closed (either mechanically or capacitively), allowing a small current to flow into the keyboard's logic circuitry. This circuitry scans the state of each key switch, filters out electrical noise from the rapid closure of the switch (debouncing), and translates the action into a keycode—in this case, the integer 13. The keyboard controller then encodes this keycode for transmission to the computer. Today, this is almost always done over a Universal Serial Bus (USB) or Bluetooth connection, though older systems used PS/2 or ADB.

In the case of a USB keyboard:

  • The keyboard is powered by a 5V supply delivered through pin 1 of the computer's USB host controller.
  • The keycode generated by the keypress is stored in an internal register known as the "endpoint."
  • The USB host controller polls this "endpoint" roughly every 10ms (the minimum interval set by the keyboard), retrieving the stored keycode.
  • The keycode is sent to the USB Serial Interface Engine (SIE), where it is converted into one or more USB packets in accordance with the USB protocol.
  • These packets are transmitted over the D+ and D- lines (the two middle pins) at a maximum rate of 1.5 Mb/s, as the keyboard is classified as a "low-speed device" (per USB 2.0 standards).
  • The computer's host USB controller decodes this serial signal, and the Human Interface Device (HID) driver interprets the keypress. Finally, the key event is passed to the operating system's hardware abstraction layer.

In the case of a virtual keyboard (such as on touch screen devices):

  • When the user touches a capacitive touch screen, a small amount of current transfers to their finger. This interaction disturbs the electrostatic field of the screen’s conductive layer, creating a voltage drop at the point of contact.
  • The screen controller detects this and triggers an interrupt, reporting the coordinates of the touch.
  • The operating system then alerts the currently active application that a press event has occurred within its graphical interface, typically on a virtual keyboard button.
  • The virtual keyboard application raises a software interrupt, which notifies the operating system of a "key pressed" event.
  • The focused application receives this notification and processes the keypress accordingly.

Interrupt Fires [Not for USB Keyboards]

For non-USB keyboards, such as those using legacy connections (e.g., PS/2), the keyboard signals an interrupt via its interrupt request line (IRQ). This IRQ is mapped to an interrupt vector (an integer) by the system's interrupt controller. The CPU consults the Interrupt Descriptor Table (IDT), which links each interrupt vector to a corresponding function known as an interrupt handler, supplied by the operating system’s kernel.

When the interrupt is triggered, the CPU uses the interrupt vector to index into the IDT and execute the appropriate interrupt handler. This process causes the CPU to transition into kernel mode, allowing the operating system to manage the keypress event.
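The lookup chain described above (IRQ line, then interrupt vector, then handler) can be pictured with a toy model. This is only a sketch: the IRQ number 1 and vector 0x21 are illustrative assumptions, and a real IDT holds gate descriptors set up by the kernel, not Python functions.

```python
from typing import Callable, Dict

def keyboard_handler() -> str:
    # Stand-in for the kernel's keyboard interrupt handler (hypothetical).
    return "keyboard interrupt handled in kernel mode"

# Assumed mapping programmed into the interrupt controller: IRQ line -> vector.
IRQ_TO_VECTOR: Dict[int, int] = {1: 0x21}

# Toy "IDT": interrupt vector -> handler function.
IDT: Dict[int, Callable[[], str]] = {0x21: keyboard_handler}

def raise_irq(irq: int) -> str:
    vector = IRQ_TO_VECTOR[irq]   # interrupt controller maps the IRQ to a vector
    handler = IDT[vector]         # CPU indexes the IDT with that vector
    return handler()              # control transfers to the kernel's handler

print(raise_irq(1))
```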

A WM_KEYDOWN Message is Sent to the App (On Windows)

When the Enter key is pressed, the Human Interface Device (HID) transport passes the key down event to the KBDHID.sys driver, which converts the HID usage data into a scan code. In this case, the scan code maps to the virtual-key code VK_RETURN (0x0D), which represents the Enter key. The KBDHID.sys driver then communicates with the KBDCLASS.sys driver (the keyboard class driver), which securely manages all keyboard input. Before proceeding, the signal may pass through any third-party keyboard filters installed on the system. All of this happens in kernel mode.

Next, Win32K.sys comes into play, determining which window is currently active by invoking the GetForegroundWindow() API. This function retrieves the window handle (hWnd) of the active application, such as the browser’s address bar. At this point, the Windows "message pump" calls SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam). The lParam parameter contains a bitmask that provides additional information about the keypress, including:

  • Repeat count (which is 0 in this case),
  • Scan code (which might be OEM-specific but typically standard for VK_RETURN),
  • Extended key flags (indicating whether modifier keys like Alt, Shift, or Ctrl were also pressed, which they weren’t).
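To make the bitmask concrete, here is a small sketch that unpacks an lParam value using the documented WM_KEYDOWN bit layout (bits 0-15 repeat count, bits 16-23 scan code, bit 24 extended-key flag, bit 30 previous key state, bit 31 transition state). The function name and the sample value are made up for illustration.

```python
def decode_keydown_lparam(lparam: int) -> dict:
    """Unpack the fields packed into the lParam of a WM_KEYDOWN message."""
    return {
        "repeat_count": lparam & 0xFFFF,             # bits 0-15
        "scan_code": (lparam >> 16) & 0xFF,          # bits 16-23
        "extended": bool((lparam >> 24) & 1),        # bit 24
        "previous_state_down": bool((lparam >> 30) & 1),  # bit 30
        "transition_up": bool((lparam >> 31) & 1),   # bit 31 (0 for key down)
    }

# Arbitrary sample value: repeat count 1, scan code 0x1C, no flags set.
sample = 0x001C0001
print(decode_keydown_lparam(sample))
```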

The SendMessage API queues the message for the specific window handle. Later, the system’s main message processing function (known as WindowProc) assigned to the window (hWnd) retrieves and processes messages in the queue.

The active window in this case is an edit control, and its WindowProc function has a message handler that responds to WM_KEYDOWN events. The handler checks the third parameter (wParam) passed by SendMessage, recognizes that the value is VK_RETURN, and thus determines that the user has pressed the Enter key. This triggers the appropriate response for the application.

A KeyDown NSEvent is Sent to the App (On OS X)

When a key is pressed on OS X, the interrupt signal triggers an event in the I/O Kit keyboard driver (a kernel extension or "kext"). This driver translates the hardware signal into a key code. The key code is then passed to the WindowServer, which manages the graphical user interface.

The WindowServer dispatches the key press event to the appropriate applications (such as the active or listening ones) by sending it through their Mach port, where it is placed into an event queue. Applications with the proper privileges can access this event queue by calling the mach_ipc_dispatch function.

Most applications handle this process through the NSApplication main event loop, which is responsible for processing user input. When the event is a key press, it is represented as an NSEvent of type NSEventTypeKeyDown. The application then reads this event and responds accordingly, triggering any code related to keypress actions based on the key code received.

The Xorg Server Listens for Keycodes (On GNU/Linux)

When a key is pressed in a graphical environment using the X server, the X server employs the evdev (event device) driver to capture the keypress event. The keycode from the physical keyboard is then re-mapped into a scancode using X server-specific keymaps and rules.

Once the mapping is complete, the X server forwards the resulting scancode to the window manager (such as DWM, Metacity, i3, etc.). The window manager, in turn, sends the character or key event to the currently focused window. The graphical API of the focused window processes this event and displays the corresponding symbol in the appropriate field, using the correct font, based on the key pressed.

This flow ensures that the character is correctly rendered in the active application’s interface, completing the keypress interaction from hardware to graphical output.

Parse URL

When the browser parses the URL (Uniform Resource Locator), it extracts the following components:

  • Protocol: "http". The browser understands that this uses the Hypertext Transfer Protocol to communicate with the server.
  • Resource: "/". This indicates that the browser should retrieve the main (index) page of the website, as the / path typically refers to the root or home page of the server.

Each of these components helps the browser interpret and fetch the desired resource from the web.
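As a rough illustration of that split, Python's standard urllib.parse module can pull the same components out of a URL. The helper name describe_url and the default-port table are assumptions for this sketch; a browser's own URL parser is considerably more involved.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url: str) -> dict:
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "resource": parts.path or "/",
    }

print(describe_url("http://google.com/"))
```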


Is it a URL or a Search Term?

When no protocol (e.g., "http") or valid domain name is provided, the browser interprets the text in the address bar as a potential search term. Instead of trying to resolve it as a URL, the browser forwards the text to its default web search engine.

In most cases, the browser appends a special identifier to the search query, indicating that the request originated from the browser's URL bar. This allows the search engine to handle and prioritize these searches accordingly, improving the relevance of the results based on the context.

This process helps the browser determine whether it should attempt to navigate directly to a website or provide search results based on the entered text.
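A deliberately simplified sketch of that decision is shown below. The rules here (an explicit scheme, no spaces, at least one dot) are assumptions chosen for illustration; real browsers apply much richer heuristics.

```python
import re

SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")
DOMAIN_RE = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+(/.*)?")

def looks_like_url(text: str) -> bool:
    text = text.strip()
    if SCHEME_RE.match(text):
        return True              # explicit protocol such as http://
    if " " in text:
        return False             # spaces suggest a search phrase
    return DOMAIN_RE.fullmatch(text) is not None  # dotted name, optional path

for entry in ["google.com", "http://google.com", "how does dns work", "google"]:
    print(entry, "->", "navigate" if looks_like_url(entry) else "search")
```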

Convert Non-ASCII Unicode Characters in the Hostname

  • The browser examines the hostname for any characters that fall outside the ASCII range, specifically those that are not in the sets a-z, A-Z, 0-9, the hyphen (-), or the dot (.).
  • In this case, the hostname is google.com, which contains only ASCII characters, so no conversion is necessary. However, if there were non-ASCII characters present in the hostname, the browser would apply Punycode encoding to convert the hostname into a valid ASCII representation. This process ensures that all characters in the hostname can be correctly processed by the network protocols.
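For a feel of what Punycode conversion produces, Python's built-in "idna" codec can be used as a stand-in. The hostname bücher.example is a made-up example, and browsers may follow newer IDNA rules than this codec implements.

```python
def to_ascii_hostname(hostname: str) -> str:
    # The "idna" codec applies Punycode to each non-ASCII label.
    return hostname.encode("idna").decode("ascii")

print(to_ascii_hostname("google.com"))       # already ASCII, unchanged
print(to_ascii_hostname("bücher.example"))   # converted to an xn-- label
```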

Check HSTS List

The browser first checks its preloaded HSTS (HTTP Strict Transport Security) list, which contains websites that have explicitly requested to be accessed only via HTTPS.

If the requested website is found on this list, the browser automatically sends the request using HTTPS rather than HTTP. If the website is not in the HSTS list, the initial request is sent via HTTP.

It’s important to note that a website can still implement HSTS without being included in the preloaded list. In such cases, the first HTTP request made by the user will return a response instructing the browser to only send subsequent requests via HTTPS. However, this initial HTTP request could expose the user to a downgrade attack, where an attacker might intercept the request and force it to remain unencrypted. This vulnerability is why modern web browsers include the HSTS list, enhancing security for users by preventing insecure connections from being established in the first place.
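The two paths described above (a preloaded list, and a policy learned from a response header) can be sketched as follows. The function names are hypothetical and the header parsing is minimal; it only handles the max-age and includeSubDomains directives of Strict-Transport-Security.

```python
def parse_hsts(header: str) -> dict:
    """Parse a Strict-Transport-Security header value (simplified)."""
    policy = {"max_age": 0, "include_subdomains": False}
    for directive in header.split(";"):
        directive = directive.strip().lower()
        if directive.startswith("max-age="):
            policy["max_age"] = int(directive.split("=", 1)[1].strip('"'))
        elif directive == "includesubdomains":
            policy["include_subdomains"] = True
    return policy

def choose_scheme(host: str, preloaded: set, learned: set) -> str:
    # HTTPS if the host is preloaded or has previously sent an HSTS header.
    return "https" if host in preloaded or host in learned else "http"

print(parse_hsts("max-age=31536000; includeSubDomains"))
print(choose_scheme("example.test", {"example.test"}, set()))
```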

DNS Lookup

The browser begins the DNS lookup process by checking if the domain is already present in its cache. (To view the DNS cache in Google Chrome, navigate to chrome://net-internals/#dns.)

If the domain is not found in the cache, the browser calls the gethostbyname library function (the specific function may vary depending on the operating system) to perform the hostname resolution.

  1. Local Hosts File Check:

    • The gethostbyname function first checks if the hostname can be resolved by referencing the local hosts file, whose location varies by operating system. This file is a simple text file that maps hostnames to IP addresses and can provide a quick resolution without querying DNS.
  2. DNS Server Request:

    • If the hostname is not cached and cannot be found in the hosts file, the browser then sends a request to the DNS server configured in the network stack. This server is typically the local router or the ISP's caching DNS server, which stores previously resolved names to speed up future requests.
  3. ARP Process for DNS Server:

    • If the DNS server is on the same subnet, the network library follows the ARP (Address Resolution Protocol) process to resolve the MAC address of the DNS server, ensuring that the request is directed correctly within the local network.
    • If the DNS server is on a different subnet, the network library instead follows the ARP process for the default gateway IP, which acts as an intermediary to route the request to the appropriate subnet.

This systematic approach ensures that the browser efficiently resolves domain names to IP addresses, enabling it to establish a connection to the desired website. By checking the cache first, using the local hosts file, and finally querying the DNS server, the browser minimizes the time spent on hostname resolution.
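The lookup order above (cache, then hosts file, then DNS server) can be sketched like this. The function names are hypothetical, and query_dns is a caller-supplied placeholder standing in for the real network query.

```python
from typing import Callable, Dict, Tuple

def parse_hosts(text: str) -> Dict[str, str]:
    """Parse hosts-file text into a hostname -> IP mapping."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # first entry wins
    return table

def resolve(name: str, cache: Dict[str, str], hosts: Dict[str, str],
            query_dns: Callable[[str], str]) -> Tuple[str, str]:
    name = name.lower()
    if name in cache:
        return cache[name], "cache"
    if name in hosts:
        return hosts[name], "hosts"
    ip = query_dns(name)       # only now go out to the configured DNS server
    cache[name] = ip           # remember the answer for next time
    return ip, "dns"

hosts = parse_hosts("127.0.0.1 localhost  # loopback\n")
print(resolve("localhost", {}, hosts, lambda n: "192.0.2.1"))
```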

Figure: DNS lookup sequence diagram

ARP Process

In order to send an ARP (Address Resolution Protocol) broadcast, the network stack library needs two key pieces of information: the target IP address that needs to be looked up and the MAC address of the interface that will be used to send out the ARP broadcast.

Checking the ARP Cache:

The ARP cache is first checked for an entry corresponding to the target IP address. If an entry exists, the library function returns the result in the format:
Target IP = MAC.

If the Entry is Not in the ARP Cache:

If there is no entry for the target IP address, the following steps are taken:

  • The route table is consulted to determine whether the target IP address is on any of the subnets listed in the local route table.
    • If it is found, the library uses the interface associated with that subnet.
    • If not, the library defaults to using the interface that connects to the default gateway.
  • The MAC address of the selected network interface is then retrieved.

Sending the ARP Request:

The network library constructs and sends a Layer 2 (data link layer of the OSI model) ARP request with the following format:

  • Sender MAC: interface:mac:address:here
  • Sender IP: interface.ip.goes.here
  • Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)
  • Target IP: target.ip.goes.here
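For readers who want to see the bytes, here is a sketch that packs an ARP request into an Ethernet frame with the struct module. The MAC and IP values are placeholders. One detail is an assumption about common practice and differs from the simplified listing: the broadcast address goes in the Ethernet destination field, while the target hardware address inside the ARP payload is usually left as zeros.

```python
import socket
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    broadcast = b"\xff" * 6
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet = struct.pack("!6s6sH", broadcast, sender_mac, 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware / protocol address lengths
        1,            # operation: 1 = request
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,  # target MAC unknown (assumed zero-filled)
        socket.inet_aton(target_ip),
    )
    return ethernet + arp

frame = build_arp_request(b"\x02\x00\x00\x00\x00\x01", "192.168.1.10", "192.168.1.1")
print(len(frame), frame.hex())
```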

Depending on the hardware setup between the computer and the router, the behavior of the ARP request varies:

Directly Connected:

If the computer is directly connected to the router, the router will respond with an ARP Reply (see below).

Hub:

If the computer is connected to a hub, the hub will broadcast the ARP request out of all its other ports. If the router is connected to the same "wire," it will respond with an ARP Reply (see below).

Switch:

If the computer is connected to a switch, the switch will check its local CAM/MAC table to identify which port has the MAC address being queried. If the switch has no entry for the MAC address, it will rebroadcast the ARP request to all other ports. If the switch does have an entry in its MAC/CAM table, it will send the ARP request only to the port that has the corresponding MAC address.

  • If the router is on the same "wire," it will respond with an ARP Reply (see below).

ARP Reply:

The ARP reply will have the following format:

  • Sender MAC: target:mac:address:here
  • Sender IP: target.ip.goes.here
  • Target MAC: interface:mac:address:here
  • Target IP: interface.ip.goes.here

Now that the network library has obtained the MAC address of either the DNS server or the default gateway, it can resume its DNS process:

  1. The DNS client establishes a socket connection to UDP port 53 on the DNS server, utilizing a source port above 1023.
  2. If the response size exceeds the UDP limit, TCP will be used instead to accommodate the larger response.
  3. If the local or ISP DNS server does not have the requested information, it will initiate a recursive search, querying a hierarchy of DNS servers until the SOA (Start of Authority) is reached, at which point the answer is returned.
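As a sketch of what goes into that UDP datagram, the snippet below builds a minimal DNS query for an A record. The query ID is arbitrary and the function name is made up; actually sending the bytes (for example with a UDP socket's sendto to port 53) is left out so the example stays offline.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

query = build_dns_query("google.com")
print(len(query), query.hex())
```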

Opening of a Socket

Once the browser receives the IP address of the destination server, it combines this with the port number specified in the URL (where HTTP defaults to port 80 and HTTPS to port 443). The browser then makes a call to the system library function named socket, requesting a TCP socket stream using AF_INET or AF_INET6 and SOCK_STREAM.
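In Python the same system call is exposed through the socket module, so the step can be sketched as below. The helper names are made up, and the connect call is only shown in a comment because it needs a resolved IP address and network access.

```python
import socket

DEFAULT_PORTS = {"http": 80, "https": 443}

def default_port(scheme: str) -> int:
    return DEFAULT_PORTS[scheme]

def open_tcp_socket(family: int = socket.AF_INET) -> socket.socket:
    # AF_INET for IPv4 (AF_INET6 for IPv6); SOCK_STREAM requests a TCP stream.
    return socket.socket(family, socket.SOCK_STREAM)

sock = open_tcp_socket()
print(sock.family, sock.type, default_port("http"))
# With a resolved address in hand, the next step would be:
#   sock.connect((ip_address, default_port("http")))
sock.close()
```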

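Transport layer processing:

  • The request is first handled by the transport layer, where a TCP segment is created. The destination port is added to the header, and the source port is chosen from the kernel's dynamic port range (specified by ip_local_port_range on Linux).

Network layer processing:

  • The segment is then passed to the network layer, which wraps it in an additional IP header. The IP addresses of the destination server and the current machine are inserted to form a packet.

Link layer processing:

  • The packet next reaches the link layer, where a frame header is added. This header includes the MAC address of the machine's NIC (network interface card) and the MAC address of the gateway (the local router). If the kernel does not know the gateway's MAC address, it must broadcast an ARP query to find it.

At this point, the packet is ready to be transmitted by one of the following methods:

  • Ethernet
  • WiFi
  • Cellular data network

For most home or small-business internet connections, the packet passes from your computer, possibly through a local network, and then through a modem (MOdulator/DEModulator). The modem converts digital 1s and 0s into an analog signal suitable for transmission over telephone, cable, or wireless telephony connections. On the other end of the connection, another modem converts the analog signal back into digital data to be processed by the next network node, where the source and destination addresses are analyzed further.

In contrast, larger businesses and some newer residential connections use fiber or direct Ethernet connections, in which case the data remains digital and is passed directly to the next network node for processing.

Eventually, the packet reaches the router managing the local subnet. From there, it continues to the autonomous system's (AS) border routers, traverses other ASes, and finally arrives at the destination server. Each router along the way extracts the destination address from the IP header and routes the packet to the appropriate next hop. The time to live (TTL) field in the IP header is decremented by one for each router that handles it. The packet is dropped if the TTL field reaches zero or if the current router has no space in its queue (which can happen under network congestion).

This send and receive process happens multiple times, following the TCP connection flow: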

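  1. The client chooses an initial sequence number (ISN) and sends a packet to the server with the SYN bit set to indicate that it is setting the ISN.
  2. The server receives the SYN and, if it is in an agreeable mood, does the following:
    • Chooses its own initial sequence number.
    • Sets the SYN bit to indicate that it is choosing its ISN.
    • Copies (client ISN + 1) to its ACK field and adds the ACK flag to indicate that it is acknowledging receipt of the first packet.
  3. The client acknowledges the connection by sending a packet that:
    • Increases its own sequence number.
    • Increases the receiver acknowledgment number.
    • Sets the ACK field.
  4. Data transfer: Data is transferred as follows:
    • As one side sends N data bytes, it increases its sequence number (SEQ) by that number.
    • When the other side acknowledges receipt of that packet (or a string of packets), it sends an ACK packet with an acknowledgment (ACK) value equal to the last received sequence from the other side.
  5. Closing the connection:
    • The side initiating the close sends a FIN packet.
    • The other side acknowledges the FIN packet and sends its own FIN.
    • The initiating side acknowledges the other side's FIN with an ACK.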
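The sequence-number bookkeeping in the first three steps can be traced with a tiny model. This is only a sketch: the dictionaries stand in for TCP headers, the ISN values are arbitrary, and no packets are actually sent.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,   # acknowledges the client's SYN
    }
    ack = {
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_SPACE,   # client's own number advanced
        "ack": (server_isn + 1) % SEQ_SPACE,   # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]

for packet in three_way_handshake(1000, 5000):
    print(sorted(packet["flags"]), packet["seq"], packet.get("ack"))
```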

Figure: Opening a socket, sequence diagram

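TLS Handshake

  • The client computer sends a ClientHello message to the server, including its Transport Layer Security (TLS) version, a list of available cipher algorithms, and compression methods.
  • In response, the server replies with a ServerHello message that specifies the TLS version, the selected cipher, the selected compression method, and the server's public certificate signed by a Certificate Authority (CA). The certificate contains a public key that the client will use to encrypt the rest of the handshake until a symmetric key is agreed upon.
  • The client verifies the server's digital certificate against its list of trusted CAs. If trust can be established based on the CA, the client generates a string of pseudo-random bytes and encrypts it with the server's public key. These random bytes are used to determine the symmetric key.
  • The server decrypts the random bytes using its private key and uses them to generate its own copy of the symmetric master key.
  • The client sends a Finished message to the server, encrypting a hash of the transmission so far with the symmetric key.
  • The server generates its own hash, then decrypts the client-sent hash to verify that it matches. If it does, the server sends its own Finished message back to the client, also encrypted with the symmetric key.
  • From this point on, the TLS session transmits application (HTTP) data encrypted with the agreed symmetric key.

This handshake establishes a secure connection between client and server, protecting the data sent over it from eavesdropping and tampering.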
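In application code the whole handshake is usually delegated to a TLS library. The sketch below uses Python's ssl module; the helper names are made up, and negotiated_tls_version is defined but not called here because it needs network access.

```python
import socket
import ssl

def make_client_context() -> ssl.SSLContext:
    # The default context verifies the server certificate against the
    # system's trusted CAs and checks that it matches the hostname.
    return ssl.create_default_context()

def negotiated_tls_version(host: str, port: int = 443) -> str:
    """Connect, let the library run the handshake, report the TLS version."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()

ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```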

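If a Packet Is Dropped

Sometimes, due to network congestion or flaky hardware connections, TLS packets are dropped before they reach their final destination. The sender then has to decide how to react. The algorithm that governs this response is called TCP congestion control. The specific implementation varies by sender; the most common algorithms are CUBIC on newer operating systems and New Reno on many others.

  • The client chooses a congestion window based on the maximum segment size (MSS) of the connection.
  • For each packet acknowledged, the window grows, roughly doubling every round trip, until it reaches the "slow-start threshold". In some implementations this threshold is adaptive and changes with network conditions.
  • After the slow-start threshold is reached, the window increases additively for each packet acknowledged. If a packet is dropped, the window shrinks exponentially until another packet is acknowledged.

This congestion control mechanism helps optimize network performance and stability, keeping data flowing efficiently while minimizing the impact of packet loss.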
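The shape of this behavior is easier to see in a simulation. The sketch below is a per-round-trip simplification in the spirit of Reno-style control, not CUBIC or any real kernel implementation: the window is counted in whole segments, and a loss halves it.

```python
def simulate_rounds(rounds: int, loss_rounds: set, ssthresh: int = 16) -> list:
    cwnd = 1            # congestion window, in segments (MSS units)
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 2)       # remember half the window
            cwnd = ssthresh                    # back off after a drop
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)     # slow start: double per round trip
        else:
            cwnd += 1                          # congestion avoidance: additive
    return history

print(simulate_rounds(8, {6}, ssthresh=8))
```

With a loss in round 6, the window climbs quickly, grows by one segment per round after the threshold, and then drops to half.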

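HTTP Protocol

If the web browser in use was developed by Google, then instead of sending a standard HTTP request to retrieve the page, it may try to negotiate an "upgrade" from HTTP to the SPDY protocol with the server.

If the client is using the HTTP protocol and does not support SPDY, it sends a request to the server in the following format: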


GET / HTTP/1.1
Host: google.com
Connection: close
[other headers]



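Here, [other headers] refers to a series of colon-separated key-value pairs formatted according to the HTTP specification and separated by single newlines. This assumes that the web browser has no bugs that violate the HTTP specification and that it is using HTTP/1.1. If it uses a different version, such as HTTP/1.0 or HTTP/0.9, it might not include the Host header in the request.

HTTP/1.1 defines the "close" connection option for the sender to signal that the connection will be closed after the response completes. For example: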


Connection: close




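HTTP/1.1 applications that do not support persistent connections must include the "close" connection option in every message.

After sending the request and headers, the web browser sends a single blank newline to the server, indicating that the content of the request is complete.

The server then responds with a response code denoting the status of the request, structured as follows: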


200 OK
[response headers]



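This is followed by a single newline and then a payload containing the HTML content of www.google.com. The server may then either close the connection or, if headers sent by the client requested it, keep the connection open to be reused for further requests.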
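The exchange can be sketched at the byte level as follows. On the wire, HTTP/1.1 lines end with a carriage return plus line feed, and a full status line carries the protocol version before the code. The function names are made up and nothing is sent over the network.

```python
def build_request(host: str, path: str = "/") -> bytes:
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_status_line(line: str) -> tuple:
    # e.g. "HTTP/1.1 200 OK" -> ("HTTP/1.1", 200, "OK")
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

print(build_request("google.com"))
print(parse_status_line("HTTP/1.1 200 OK"))
```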

If the HTTP headers sent by the web browser contained sufficient information for the web server to determine whether the version of the file cached by the web browser has been unmodified since the last retrieval (for example, if the web browser included an ETag header), the server may instead respond with:


304 Not Modified
[response headers]



This response will have no payload, and the web browser will retrieve the HTML from its cache.
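A server-side sketch of that decision is shown below. In practice the browser usually sends the cached ETag value back in an If-None-Match request header; the function name, the header dictionary, and the ETag value are illustrative assumptions.

```python
def respond(request_headers: dict, current_etag: str, body: bytes) -> tuple:
    """Return (status, headers, payload) for a possibly conditional GET."""
    if request_headers.get("If-None-Match") == current_etag:
        # The client's cached copy is still current: no payload needed.
        return 304, {"ETag": current_etag}, b""
    return 200, {"ETag": current_etag}, body

page = b"<html>...</html>"
print(respond({}, '"v1"', page)[0])                          # first visit
print(respond({"If-None-Match": '"v1"'}, '"v1"', page)[0])   # revalidation
```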

After parsing the HTML, the web browser (and server) repeats this process for every resource (image, CSS, favicon.ico, etc.) referenced in the HTML page. In these cases, instead of GET / HTTP/1.1, the request will be structured as:


GET /$(URL relative to www.google.com) HTTP/1.1




If the HTML references a resource on a different domain than www.google.com, the web browser returns to the steps involved in resolving the other domain, following all steps up to this point for that domain. The Host header in the request will be set to the appropriate server name instead of google.com.

HTTP Server Request Handling

The HTTPD (HTTP Daemon) server is responsible for handling requests and responses on the server side. The most common HTTPD servers include Apache and Nginx for Linux, as well as IIS for Windows.

  1. Receiving the Request: The HTTPD server receives the incoming request from the client.
  2. Breaking Down the Request: The server analyzes the request and extracts the following parameters:
    • HTTP Request Method: This could be one of several methods, including GET, HEAD, POST, PUT, PATCH, DELETE, CONNECT, OPTIONS, or TRACE. In the case of a URL entered directly into the address bar, the method will typically be GET.
    • Domain: In this case, the domain is google.com.
    • Requested Path/Page: Here, the requested path is /, indicating that no specific page was requested; thus, / is treated as the default path.
  3. Verifying the Virtual Host: The server checks whether a Virtual Host is configured for google.com.
  4. Method Verification: The server verifies that google.com can accept GET requests.
  5. Client Permission Check: The server checks if the client is allowed to use this method based on criteria such as IP address, authentication, etc.
  6. Request Rewriting: If the server has a rewrite module installed (such as mod_rewrite for Apache or URL Rewrite for IIS), it attempts to match the request against any configured rules. If a matching rule is found, the server rewrites the request according to that rule.
  7. Content Retrieval: The server retrieves the content that corresponds to the request. In this case, it will typically default to the index file since the request path is /. While there are cases that can override this behavior, using the index file is the most common method.
  8. File Parsing and Processing: The server parses the index file according to the designated handler. If Google is using PHP, for example, the server will utilize PHP to interpret the index file and stream the output back to the client.

By following these steps, the HTTPD server efficiently processes incoming requests and returns the appropriate responses to the client.
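A compressed sketch of steps 2 through 7 is given below. The virtual-host table, the index file name, and the choice of 404 and 405 status codes for failures are assumptions made for illustration; real servers such as Apache or Nginx drive this from their configuration files.

```python
def handle_request(raw: str, vhosts: dict) -> tuple:
    """Return (status, resolved_path) for a raw HTTP/1.1 request string."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ", 2)   # break down the request
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    site = vhosts.get(headers.get("host", ""))             # virtual host check
    if site is None:
        return 404, None
    if method not in site["allowed_methods"]:              # method verification
        return 405, None
    if path.endswith("/"):
        path += site["index"]                              # default to the index file
    return 200, path

vhosts = {"google.com": {"allowed_methods": {"GET", "HEAD"}, "index": "index.html"}}
print(handle_request("GET / HTTP/1.1\r\nHost: google.com\r\nConnection: close\r\n\r\n", vhosts))
```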

Browser

The primary functionality of a browser is to present the web resources you choose by requesting them from a server and displaying them in the browser window. The resource is typically an HTML document but may also include PDFs, images, or other types of content. The location of the resource is specified by the user using a URI (Uniform Resource Identifier).

The way a browser interprets and displays HTML files is defined by the HTML and CSS specifications. The CSS specifications are maintained by the W3C (World Wide Web Consortium), the standards organization for the web, while the HTML standard is now maintained by the WHATWG.

Browser user interfaces share many common features, including:

  • An address bar for entering a URI
  • Back and forward buttons for navigation
  • Bookmarking options for saving favorite pages
  • Refresh and stop buttons for refreshing or halting the loading of current documents
  • A home button that takes you to your home page

Browser High-Level Structure

The components of a browser can be broken down as follows:

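  • User Interface: This includes the address bar, back/forward buttons, bookmarking menu, and every other part of the browser display except the window in which the requested page is shown.
  • Browser Engine: The browser engine acts as a bridge between the user interface and the rendering engine, marshalling actions and interactions between them.
  • Rendering Engine: Responsible for displaying the requested content, the rendering engine parses HTML and CSS and turns the parsed content into a visual representation on screen.
  • Networking: This component handles network calls such as HTTP requests, using different implementations tailored to each platform behind a platform-independent interface.
  • UI Backend: The UI backend draws basic widgets such as combo boxes and windows. It exposes a generic interface that is not platform-specific and, underneath, relies on the operating system's user interface methods.
  • JavaScript Engine: This engine parses and executes JavaScript code, enabling dynamic content and interactivity within web pages.
  • Data Storage: This acts as a persistence layer, letting the browser save various kinds of data locally, such as cookies. Browsers also support storage mechanisms such as localStorage, IndexedDB, WebSQL, and FileSystem.

Each component works together to create a seamless browsing experience, allowing users to access and interact with web resources efficiently.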

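HTML Parsing

The rendering engine starts retrieving the contents of the requested document from the networking layer, usually in 8 kB chunks. The primary job of the HTML parser is to turn the HTML markup into a structured representation called a parse tree.

The output tree (the "parse tree") is a hierarchy of DOM (Document Object Model) element and attribute nodes. The DOM is the object representation of the HTML document and the interface through which HTML elements interact with outside code such as JavaScript. The root of the tree is the "Document" object, and before any script manipulation the DOM has an almost one-to-one correspondence with the original markup.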

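The Parsing Algorithm

HTML cannot be parsed effectively with conventional top-down or bottom-up parsers, for several reasons:

  • The forgiving nature of the language: HTML is designed to tolerate syntax errors, so browsers can display content even when the markup is imperfect.
  • Browser error tolerance: Browsers are built to handle well-known cases of invalid HTML so that users still get a working page.
  • Reentrant parsing: In other languages, the source does not change during parsing. In HTML, however, dynamic code (such as script elements containing document.write() calls) can add to the input while it is being parsed, which calls for a different approach.

Because of these challenges, browsers use a custom parser built specifically for HTML. The parsing algorithm is described in detail in the HTML5 specification and consists of two main stages: tokenization and tree construction.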
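The tokenization stage and its tolerance for sloppy markup can be observed with Python's standard html.parser module. It is a much simpler tokenizer than a browser's and does not build a DOM tree, so treat this as an illustration of the idea only; the class and function names are made up.

```python
from html.parser import HTMLParser

class EventCollector(HTMLParser):
    """Records the tokens the parser emits instead of building a tree."""
    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("data", data.strip()))

def tokenize(markup: str) -> list:
    collector = EventCollector()
    collector.feed(markup)
    collector.close()
    return collector.events

# The <b> element is never closed, yet no syntax error is raised.
print(tokenize("<p>Hello <b>world</p>"))
```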

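Actions When Parsing Is Finished

Once parsing is complete, the browser goes on to fetch the external resources linked to the page, such as CSS stylesheets, images, and JavaScript files. At this stage the browser marks the document as interactive and starts parsing scripts that are in "deferred" mode, meaning scripts that should execute after the document has been fully parsed. The document state is then set to "complete" and a "load" event is fired.

Importantly, browsers never raise an "Invalid Syntax" error for an HTML page. Instead, they automatically fix any invalid content and carry on processing the document, so users can view the page with minimal disruption.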

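CSS Interpretation

CSS interpretation involves several key steps:

  • Parsing CSS: The browser parses external CSS files, style tag contents, and style attribute values.
  • Creating StyleSheet objects: Each parsed CSS file is turned into a StyleSheet object. Each StyleSheet object holds CSS rules, including selectors and their corresponding CSS declarations. This structured representation allows styles to be accessed and manipulated efficiently.
  • Parsing techniques: A CSS parser can be top-down or bottom-up, depending on the specific parser generator used. The technique determines how the parser reads and processes CSS rules, which affects the efficiency and accuracy of parsing.

Through this interpretation, the browser builds a full picture of how styles apply to the HTML elements in the DOM, so the page can be rendered with its intended visual presentation.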

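Page Rendering

Rendering a web page involves several structured steps:

  • Creating the frame tree: The rendering engine builds a "frame tree" or "render tree" by traversing the DOM nodes and calculating the CSS style values for each node. This tree represents the visual structure of the page.
  • Calculating preferred widths: The preferred width of each node in the frame tree is calculated bottom-up by summing the preferred widths of the child nodes and the node's horizontal margins, borders, and padding.
  • Calculating actual widths: The actual width of each node is determined top-down by allocating each node's available width to its children as needed.
  • Calculating heights: The height of each node is calculated bottom-up by applying text wrapping and summing the child node heights and the node's margins, borders, and padding.
  • Determining node coordinates: The coordinates of each node are calculated from the width and height information gathered in the previous steps.
  • Handling complex elements: More complicated calculations are performed for elements that are floated, positioned absolutely or relatively, or that use other complex features. See the CSS2 specification and current CSS work for details.
  • Creating layers: Layers are created to describe which parts of the page can be animated together without being re-rasterized. Each frame/render object is assigned to a layer.
  • Allocating textures: Textures are allocated for each layer of the page to optimize rendering performance.
  • Executing drawing commands: The frame/render objects of each layer are traversed and drawing commands are executed for their respective layer. This rasterization can be handled by the CPU or drawn directly on the GPU using technologies such as D2D (Direct2D) or SkiaGL.
  • Reusing calculated values: The rendering process can reuse calculated values from the previous render of the page, so incremental changes need less computation.
  • Compositing layers: The final page layers are sent to the compositing process, where they are combined with layers for other visible content such as the browser chrome, iframes, and addon panels.
  • Final render commands: Final layer positions are computed and the composite commands are issued via graphics APIs such as Direct3D or OpenGL. The GPU command buffers are flushed to the GPU for asynchronous rendering, and the finished frame is sent to the window server for display.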
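The bottom-up preferred-width pass from the list above can be sketched with a small recursive function. The Box class is hypothetical, sizes are plain integers, and children are assumed to sit side by side; real layout engines handle many more cases (block versus inline flow, min/max constraints, and so on).

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Box:
    content_width: int = 0     # intrinsic width of a leaf's own content
    margin: int = 0            # horizontal margin, per side
    border: int = 0            # border width, per side
    padding: int = 0           # horizontal padding, per side
    children: List["Box"] = field(default_factory=list)

def preferred_width(box: Box) -> int:
    # Bottom-up: children first, then add this node's own horizontal extras.
    if box.children:
        inner = sum(preferred_width(child) for child in box.children)
    else:
        inner = box.content_width
    return inner + 2 * (box.margin + box.border + box.padding)

leaf_a = Box(content_width=100, padding=5)
leaf_b = Box(content_width=100, padding=5)
parent = Box(margin=10, children=[leaf_a, leaf_b])
print(preferred_width(leaf_a), preferred_width(parent))
```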

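GPU Rendering

  • During rendering, graphical computing tasks can use either the general-purpose CPU or the dedicated graphics processor, the GPU.
  • When the GPU is used for graphical rendering computations, the graphics software layers split the workload into many smaller tasks. This lets them take full advantage of the GPU's massive parallelism, which is particularly effective for the floating-point calculations that rendering requires.
  • GPUs excel at handling large numbers of operations at once, which makes them well suited to rendering complex visual content quickly and efficiently. This parallelism significantly improves performance, especially in applications involving high-resolution graphics, animation, and real-time rendering.
  • As a result, using the GPU not only speeds up rendering but also enables more sophisticated visual effects and a smoother user experience in modern web applications and graphics-intensive software.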

Figure: This image is also rendered by the GPU

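Post-Rendering and User-Induced Execution

After rendering has completed, the browser executes JavaScript code triggered by various events, such as timing mechanisms (like a Google Doodle animation) or user interaction (for example, typing a query into the search box and receiving suggestions).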

  • Plugins: Additionally, plugins such as Flash or Java may also execute, although they typically do not run at this point on the Google homepage.
  • Network Requests: JavaScript scripts can initiate further network requests, fetching additional resources or data as needed.
  • DOM Modifications: These scripts can modify the existing page or its layout, which can trigger another round of page rendering and painting.

This dynamic capability allows for interactive experiences in which content changes in real time based on user actions or other conditions. The interaction between JavaScript execution and the rendering engine is what lets developers build applications that respond intuitively to user input and changing contexts.


source:dev.to