How Webpack operates caching-JS Tutorial-php.cn

Home

Web Front-end

JS Tutorial

How Webpack operates caching

php中世界最好的语言

Jun 14, 2018 am 11:39 AM

webpack cache

This time I will show you how Webpack operates the cache, and what are the precautions for Webpack to operate the cache. The following is a practical case, let's take a look.

Preface

Recently I was looking at how webpack makes persistent cache content, and found that there are still some pitfalls in it. I just have time to sort and summarize them. After reading this article, you can roughly understand:

What is persistent caching and why do we need to do persistent caching?
How does webpack do persistent caching? ?
Some points to note when doing caching with webpack.

Persistence Cache

First of all, we need to explain what persistence cache is, in the context of the current popularity of applications that separate front and rear ends. Under the circumstances, front-end html, css, and js often exist on the server in the form of a static resource file, and data is obtained through interfaces to display dynamic content. This involves the issue of how the company deploys front-end code, so it involves an issue of update deployment. Should the page be deployed first, or the resources be deployed first?

Deploy the page first, then deploy the resources: During the time interval between the two deployments, if a user accesses the page, the old resource will be loaded in the new page structure, and the old version of the resource will be regarded as the new When the version is cached, the result is that the user accesses a page with a disordered style. Unless refreshed manually, the page will remain in a disordered state until the resource cache expires.

Deploy resources first, then deploy pages: During the deployment interval, users with locally cached resources of older versions visit the website. Since the requested page is an older version, the resource reference has not changed and the browser will use it directly. Local caching is normal, but when users without local caching or cache expired visit the website, the old version page will load the new version resource, resulting in page execution errors.

So we need a deployment strategy to ensure that when we update our online code, online users can transition smoothly and open our website correctly.

It is recommended to read this answer first: How to develop and deploy front-end code in a large company?

After you read the above answer, you will roughly understand that the more mature persistent caching solution now is to add a hash value after the name of the static resource, because the hash value generated each time the file is modified is different. The advantage of this is to publish files incrementally to avoid overwriting previous files and causing online user access to fail.

Because as long as the names of the static resources (css, js, img) published each time are unique, then I can:

For html files : Do not enable caching, put the html on your own server, turn off the server's cache, and your own server only provides html files and data interfaces
For static js, css, pictures, etc. File: Enable CDN and caching, and upload static resources to the CDN service provider. We can enable long-term caching of resources. Because the path of each resource is unique, it will not cause the resource to be overwritten, ensuring the stability of online user access. sex.
Every time an update is released, the static resources (js, css, img) are first transferred to the cdn service, and then the html file is uploaded. This ensures that old users can Normal access allows new users to see new pages.

The above briefly introduces the mainstream front-end persistent caching solution, so why do we need to do persistent caching?

When a user visits our site for the first time using a browser, the page introduces a variety of static resources. If we can achieve persistent caching, we can add Cache- in the http response header. control or Expires field to set the cache, the browser can cache these resources locally one by one.

If the user needs to request the same static resource again during subsequent visits, and the static resource has not expired, the browser can directly cache it locally instead of requesting the resource through the network.

How webpack does persistent caching

After briefly introducing persistent caching, the following is the focus, so how should we do persistent caching in webpack? Well, we need to do the following two things:

Ensure the uniqueness of the hash value, that is, generate a unique hash value for each packaged resource. As long as the packaged content is inconsistent, then The hash values are inconsistent.
To ensure the stability of the hash value, we need to ensure that when a module is modified, only the hash value of the affected packaged file changes, and the hash value of the packaged file unrelated to the module remains unchanged.

Hash file name is the first step to implement persistent caching. Currently, webpack has two ways to calculate hash ([hash] and [chunkhash])

hash means that each time webpack generates a unique hash value during the compilation process, it will be re-created after any file in the project is changed, and then webpack calculates the new hash value.
chunkhash is a hash value calculated based on the module, so changes to a certain file will only affect its own hash value and will not affect other files.

So if you simply package everything into the same file, then hash can satisfy you. If your project involves unpacking, loading in modules, etc. , then you need to use chunkhash to ensure that only the relevant file hash value changes after each update.

So our webpack configuration with persistent cache should look like this:

module.exports = {
 entry: __dirname + '/src/index.js',
 output: {
 path: __dirname + '/dist',
 filename: '[name].[chunkhash:8].js',
 }
}

Copy after login

The meaning of the above code is: use index.js as the entry point to package all the code into one The file is named index.xxxx.js and placed in the dist directory. Now we can generate a newly named file every time we update the project.

If you are dealing with simple scenarios, this is enough, but in large multi-page applications, we often need to optimize the performance of the page:

Separate business Code and third-party code: The reason why we separate business code and third-party code is because business code updates frequently and third-party code updates and iterations are slow, so we separate third-party code (libraries, frameworks) , which can make full use of the browser's cache to load third-party libraries.
On-demand loading: For example, when using React-Router, when the user needs to access a certain route, the corresponding component will be loaded. Then the user does not need to load it at the beginning. At this time, all routing components will be downloaded locally.
In multi-page applications, we can often extract common modules, such as header, footer, etc., so that when the page jumps, these common modules exist in the cache , you can load it directly instead of making a network request.

So how to unpack and load modules into modules requires webpack's built-in plug-in: CommonsChunkPlugin. Below I will use an example to explain how to configure webpack.

The code of this article is placed on my Github. If you are interested, you can download it and take a look:

git clone https://github.com/happylindz/blog.git
cd blog/code/multiple-page-webpack-demo
npm install

Copy after login

Before reading the following content, I strongly recommend that you read my previous article: In-depth understanding Webpack file packaging mechanism. Understanding the webpack file packaging mechanism will help you better implement persistent caching.

The example is roughly described like this: it consists of two pages pageA and pageB

// src/pageA.js
import componentA from './common/componentA';
// 使用到 jquery 第三方库，需要抽离，避免业务打包文件过大
import $ from 'jquery';
// 加载 css 文件，一部分为公共样式，一部分为独有样式，需要抽离
import './css/common.css'
import './css/pageA.css';
console.log(componentA);
console.log($.trim(' do something '));
// src/pageB.js
// 页面 A 和 B 都用到了公共模块 componentA，需要抽离，避免重复加载
import componentA from './common/componentA';
import componentB from './common/componentB';
import './css/common.css'
import './css/pageB.css';
console.log(componentA);
console.log(componentB);
// 用到异步加载模块 asyncComponent，需要抽离，加载首屏速度
document.getElementById('xxxxx').addEventListener('click', () => {
 import( /* webpackChunkName: "async" */
 './common/asyncComponent.js').then((async) => {
  async();
 })
})
// 公共模块基本长这样
export default "component X";

Copy after login

The above page content basically involves the three modes of splitting our modules: splitting the public library , loading and splitting common modules on demand. Then the next step is to configure webpack:

const path = require('path');
const webpack = require('webpack');
const ExtractTextPlugin = require('extract-text-webpack-plugin');
module.exports = {
 entry: {
 pageA: [path.resolve(__dirname, './src/pageA.js')],
 pageB: path.resolve(__dirname, './src/pageB.js'),
 },
 output: {
 path: path.resolve(__dirname, './dist'),
 filename: 'js/[name].[chunkhash:8].js',
 chunkFilename: 'js/[name].[chunkhash:8].js'
 },
 module: {
 rules: [
  {
  // 用正则去匹配要用该 loader 转换的 CSS 文件
  test: /.css$/,
  use: ExtractTextPlugin.extract({
   fallback: "style-loader",
   use: ["css-loader"]
  }) 
  }
 ]
 },
 plugins: [
 new webpack.optimize.CommonsChunkPlugin({
  name: 'common',
  minChunks: 2,
 }),
 new webpack.optimize.CommonsChunkPlugin({
  name: 'vendor',
  minChunks: ({ resource }) => (
  resource && resource.indexOf('node_modules') >= 0 && resource.match(/.js$/)
  )
 }),
 new ExtractTextPlugin({
  filename: `css/[name].[chunkhash:8].css`,
 }),
 ]
}

Copy after login

The first CommonsChunkPlugin is used to extract public modules, which is equivalent to saying webpack boss, if you see a module being loaded twice or more, then please Help me move it to the common chunk. Here, minChunks is 2, and the granularity is the smallest. You can choose how many times to use the modules before extracting them according to your actual situation.

The second CommonsChunkPlugin is used to extract third-party codes, extract them, and determine whether the resources come from node_modules. If so, it means that they are third-party modules, then extract them. It is equivalent to telling the webpack boss that if you see some modules coming from the node_modules directory and their names end with .js, please move them to the vendor chunk. If the vendor chunk does not exist, create a new one. .

What are the benefits of this configuration? As our business grows, we are likely to rely on more and more third-party library codes. If we specially configure an entrance to store third-party code, then our webpack .config.js will become:

// 不利于拓展
module.exports = {
 entry: {
 app: './src/main.js',
 vendor: [
  'vue',
  'axio',
  'vue-router',
  'vuex',
  // more
 ],
 },
}

Copy after login

The third ExtractTextPlugin plug-in is used to extract css from the packaged js file and generate an independent css file. Imagine that when you just modify it The style does not modify the functional logic of the page. You definitely don't want the hash value of your js file to change. You definitely want css and js to be separated from each other and not affect each other.

运行 webpack 后可以看到打包之后的效果:

├── css
│ ├── common.2beb7387.css
│ ├── pageA.d178426d.css
│ └── pageB.33931188.css
└── js
 ├── async.03f28faf.js
 ├── common.2beb7387.js
 ├── pageA.d178426d.js
 ├── pageB.33931188.js
 └── vendor.22a1d956.js

Copy after login

可以看出 css 和 js 已经分离，并且我们对模块进行了拆分，保证了模块 chunk 的唯一性，当你每次更新代码的时候，会生成不一样的 hash 值。

唯一性有了，那么我们需要保证 hash 值的稳定性，试想下这样的场景，你肯定不希望你修改某部分的代码(模块，css)导致了文件的 hash 值全变了，那么显然是不明智的，那么我们去做到 hash 值变化最小化呢？

换句话说，我们就要找出 webpack 编译中会导致缓存失效的因素，想办法去解决或优化它？

影响 chunkhash 值变化主要由以下四个部分引起的：

包含模块的源代码
webpack 用于启动运行的 runtime 代码
webpack 生成的模块 moduleid(包括包含模块 id 和被引用的依赖模块 id)
chunkID

这四部分只要有任意部分发生变化，生成的分块文件就不一样了，缓存也就会失效，下面就从四个部分一一介绍：

一、源代码变化：

显然不用多说，缓存必须要刷新，不然就有问题了

二、webpack 启动运行的 runtime 代码：

看过我之前的文章：深入理解 webpack 文件打包机制就会知道，在 webpack 启动的时候需要执行一些启动代码。

(function(modules) {
 window["webpackJsonp"] = function webpackJsonpCallback(chunkIds, moreModules) {
 // ...
 };
 function __webpack_require__(moduleId) {
 // ...
 }
 __webpack_require__.e = function requireEnsure(chunkId, callback) {
 // ...
 script.src = __webpack_require__.p + "" + chunkId + "." + ({"0":"pageA","1":"pageB","3":"vendor"}[chunkId]||chunkId) + "." + {"0":"e72ce7d4","1":"69f6bbe3","2":"9adbbaa0","3":"53fa02a7"}[chunkId] + ".js";
 };
})([]);

Copy after login

大致内容像上面这样，它们是 webpack 的一些启动代码，它们是一些函数，告诉浏览器如何加载 webpack 定义的模块。

其中有一行代码每次更新都会改变的，因为启动代码需要清楚地知道 chunkid 和 chunkhash 值得对应关系，这样在异步加载的时候才能正确地拼接出异步 js 文件的路径。

那么这部分代码最终放在哪个文件呢？因为我们刚才配置的时候最后生成的 common chunk 模块，那么这部分运行时代码会被直接内置在里面，这就导致了，我们每次更新我们业务代码(pageA, pageB, 模块)的时候， common chunkhash 会一直变化，但是这显然不符合我们的设想，因为我们只是要用 common chunk 用来存放公共模块(这里指的是 componentA)，那么我 componentA 都没去修改，凭啥 chunkhash 需要变了。

所以我们需要将这部分 runtime 代码抽离成单独文件。

module.exports = {
 // ...
 plugins: [
 // ...
 // 放到其他的 CommonsChunkPlugin 后面
 new webpack.optimize.CommonsChunkPlugin({
  name: 'runtime',
  minChunks: Infinity,
 }),
 ]
}

Copy after login

这相当于是告诉 webpack 帮我把运行时代码抽离，放到单独的文件中。

├── css
│ ├── common.4cc08e4d.css
│ ├── pageA.d178426d.css
│ └── pageB.33931188.css
└── js
 ├── async.03f28faf.js
 ├── common.4cc08e4d.js
 ├── pageA.d178426d.js
 ├── pageB.33931188.js
 ├── runtime.8c79fdcd.js
 └── vendor.cef44292.js

Copy after login

多生成了一个 runtime.xxxx.js，以后你在改动业务代码的时候，common chunk 的 hash 值就不会变了，取而代之的是 runtime chunk hash 值会变，既然这部分代码是动态的，可以通过 chunk-manifest-webpack-plugin 将他们 inline 到 html 中，减少一次网络请求。

三、webpack 生成的模块 moduleid

在 webpack2 中默认加载 OccurrenceOrderPlugin 这个插件，OccurrenceOrderPlugin 插件会按引入次数最多的模块进行排序，引入次数的模块的 moduleId 越小，但是这仍然是不稳定的，随着你代码量的增加，虽然代码引用次数的模块 moduleId 越小，越不容易变化，但是难免还是不确定的。

默认情况下，模块的 id 是这个模块在模块数组中的索引。OccurenceOrderPlugin 会将引用次数多的模块放在前面，在每次编译时模块的顺序都是一致的，如果你修改代码时新增或删除了一些模块，这将可能会影响到所有模块的 id。

最佳实践方案是通过 HashedModuleIdsPlugin 这个插件，这个插件会根据模块的相对路径生成一个长度只有四位的字符串作为模块的 id，既隐藏了模块的路径信息，又减少了模块 id 的长度。

这样一来，改变 moduleId 的方式就只有文件路径的改变了，只要你的文件路径值不变，生成四位的字符串就不变，hash 值也不变。增加或删除业务代码模块不会对 moduleid 产生任何影响。

module.exports = {
 plugins: [
 new webpack.HashedModuleIdsPlugin(),
 // 放在最前面
 // ...
 ]
}

Copy after login

四、chunkID

实际情况中分块的个数的顺序在多次编译之间大多都是固定的, 不太容易发生变化。

这里涉及的只是比较基础的模块拆分，还有一些其它情况没有考虑到，比如异步加载组件中包含公共模块，可以再次将公共模块进行抽离。形成异步公共 chunk 模块。有想深入学习的可以看这篇文章：Webpack 大法之 Code Splitting

webpack 做缓存的一些注意点

CSS 文件 hash 值失效的问题
不建议线上发布使用 DllPlugin 插件

CSS 文件 hash 值失效的问题：

ExtractTextPlugin 有个比较严重的问题，那就是它生成文件名所用的[chunkhash]是直接取自于引用该 css 代码段的 js chunk ；换句话说，如果我只是修改 css 代码段，而不动 js 代码，那么最后生成出来的 css 文件名依然没有变化。

所以我们需要将 ExtractTextPlugin 中的 chunkhash 改为 contenthash，顾名思义，contenthash 代表的是文本文件内容的 hash 值，也就是只有 style 文件的 hash 值。这样编译出来的 js 和 css 文件就有独立的 hash 值了。

module.exports = {
 plugins: [
 // ...
 new ExtractTextPlugin({
  filename: `css/[name].[contenthash:8].css`,
 }),
 ]
}

Copy after login

如果你使用的是 webpack2，webpack3，那么恭喜你，这样就足够了，js 文件和 css 文件修改都不会影响到相互的 hash 值。那如果你使用的是 webpack1，那么就会出现问题。

具体来讲就是 webpack1 和 webpack 在计算 chunkhash 值得不同：

webpack1 在涉及的时候并没有考虑像 ExtractTextPlugin 会将模块内容抽离的问题，所以它在计算 chunkhash 的时候是通过打包之前模块内容去计算的，也就是说在计算的时候 css 内容也包含在内，之后才将 css 内容抽离成单独的文件，

那么就会出现：如果只修改了 css 文件，未修改引用的 js 文件，那么编译输出的 js 文件的 hash 值也会改变。

对此，webpack2 做了改进，它是基于打包后文件内容来计算 hash 值的，所以是在 ExtractTextPlugin 抽离 css 代码之后，所以就不存在上述这样的问题。如果不幸的你还在使用 webpack1，那么推荐你使用 md5-hash-webpack-plugin 插件来改变 webpack 计算 hash 的策略。

不建议线上发布使用 DllPlugin 插件

为什么这么说呢？因为最近有朋友来问我，他们 leader 不让在线上用 DllPlugin 插件，来问我为什么？

DllPlugin 本身有几个缺点：

首先你需要额外多配置一份 webpack 配置，增加工作量。
其中一个页面用到了一个体积很大的第三方依赖库而其它页面根本不需要用到，但若直接将它打包在 dll.js 里很不值得，每次页面打开都要去加载这段无用的代码，无法使用到 webpack2 的 Code Splitting 功能。
第一次打开的时候需要下载 dll 文件，因为你把很多库全部打在一起了，导致 dll 文件很大，首次进入页面加载速度很慢。

虽然你可以打包成 dll 文件，然后让浏览器去读取缓存，这样下次就不用再去请求，比如你用 lodash 其中一个函数，而你用dll会将整个 lodash 文件打进去，这就会导致你加载无用代码过多，不利于首屏渲染时间。

我认为的正确的姿势是：

像 React、Vue 这样整体性偏强的库，可以生成 vendor 第三方库来去做缓存，因为你一般技术体系是固定的，一个站点里面基本上都会用到统一技术体系，所以生成 vendor 库用于缓存。
像 antd、lodash 这种功能性组件库，可以通过 tree shaking 来进行消除，只保留有用的代码，千万不要直接打到 vendor 第三方库里，不然你将大量执行无用的代码。

相信看了本文案例你已经掌握了方法，更多精彩请关注php中文网其它相关文章！

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

How to fix KB5055523 fails to install in Windows 11?

3 weeks ago By DDD

How to fix KB5055518 fails to install in Windows 10?

3 weeks ago By DDD

Strength Levels for Every Enemy & Monster in R.E.P.O.

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Roblox: Dead Rails - How To Tame Wolves

3 weeks ago By DDD

Blue Prince: How To Get To The Basement

3 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial

1655

CakePHP Tutorial

1413

Laravel Tutorial

1306

PHP Tutorial

1252

C# Tutorial

1225

Related knowledge

Where are video files stored in browser cache? Feb 19, 2024 pm 05:09 PM

Which folder does the browser cache the video in? When we use the Internet browser every day, we often watch various online videos, such as watching music videos on YouTube or watching movies on Netflix. These videos will be cached by the browser during the loading process so that they can be loaded quickly when played again in the future. So the question is, in which folder are these cached videos actually stored? Different browsers store cached video folders in different locations. Below we will introduce several common browsers and their

How to view and refresh dns cache in Linux Mar 07, 2024 am 08:43 AM

DNS (DomainNameSystem) is a system used on the Internet to convert domain names into corresponding IP addresses. In Linux systems, DNS caching is a mechanism that stores the mapping relationship between domain names and IP addresses locally, which can increase the speed of domain name resolution and reduce the burden on the DNS server. DNS caching allows the system to quickly retrieve the IP address when subsequently accessing the same domain name without having to issue a query request to the DNS server each time, thereby improving network performance and efficiency. This article will discuss with you how to view and refresh the DNS cache on Linux, as well as related details and sample code. Importance of DNS Caching In Linux systems, DNS caching plays a key role. its existence

Speed up your applications: A simple guide to Guava caching Jan 31, 2024 pm 09:11 PM

A Beginner's Guide to Guava Cache: Speed Up Your Applications Guava Cache is a high-performance in-memory caching library that can significantly improve application performance. It provides a variety of caching strategies, including LRU (least recently used), LFU (least recently used), and TTL (time to live). 1. Install Guava cache and add the dependency of Guava cache library to your project. com.goog

The relationship between CPU, memory and cache is explained in detail! Mar 07, 2024 am 08:30 AM

There is a close interaction between the CPU (central processing unit), memory (random access memory), and cache, which together form a critical component of a computer system. The coordination between them ensures the normal operation and efficient performance of the computer. As the brain of the computer, the CPU is responsible for executing various instructions and data processing; the memory is used to temporarily store data and programs, providing fast read and write access speeds; and the cache plays a buffering role, speeding up data access speed and improving The computer's CPU is the core component of the computer and is responsible for executing various instructions, arithmetic operations, and logical operations. It is called the "brain" of the computer and plays an important role in processing data and performing tasks. Memory is an important storage device in a computer.

Will HTML files be cached? Feb 19, 2024 pm 01:51 PM

Title: Caching mechanism and code examples of HTML files Introduction: When writing web pages, we often encounter browser cache problems. This article will introduce the caching mechanism of HTML files in detail and provide some specific code examples to help readers better understand and apply this mechanism. 1. Browser caching principle In the browser, whenever a web page is accessed, the browser will first check whether there is a copy of the web page in the cache. If there is, the web page content is obtained directly from the cache. This is the basic principle of browser caching. Benefits of browser caching mechanism

Spring Boot performance optimization tips: create applications as fast as the wind Feb 25, 2024 pm 01:01 PM

SpringBoot is a popular Java framework known for its ease of use and rapid development. However, as the complexity of the application increases, performance issues can become a bottleneck. In order to help you create a springBoot application as fast as the wind, this article will share some practical performance optimization tips. Optimize startup time Application startup time is one of the key factors of user experience. SpringBoot provides several ways to optimize startup time, such as using caching, reducing log output, and optimizing classpath scanning. You can do this by setting spring.main.lazy-initialization in the application.properties file

Advanced Usage of PHP APCu: Unlocking the Hidden Power Mar 01, 2024 pm 09:10 PM

PHPAPCu (replacement of php cache) is an opcode cache and data cache module that accelerates PHP applications. Understanding its advanced features is crucial to utilizing its full potential. 1. Batch operation: APCu provides a batch operation method that can process a large number of key-value pairs at the same time. This is useful for large-scale cache clearing or updates. //Get cache keys in batches $values=apcu_fetch(["key1","key2","key3"]); //Clear cache keys in batches apcu_delete(["key1","key2","key3"]);2 .Set cache expiration time: APCu allows you to set an expiration time for cache items so that they automatically expire after a specified time.

How to save video files from browser cache to local Feb 23, 2024 pm 06:45 PM

How to Export Browser Cache Videos With the rapid development of the Internet, videos have become an indispensable part of people's daily lives. When browsing the web, we often encounter video content that we want to save or share, but sometimes we cannot find the source of the video files because they may only exist in the browser's cache. So, how do you export videos from your browser cache? This article will introduce you to several common methods. First, we need to clarify a concept, namely browser cache. The browser cache is used by the browser to improve user experience.

See all articles