I noticed that a web page downloaded with wget on Linux is a different size from the same page saved in the browser (Chrome) with "Save As > Webpage, Complete". The wget copy is clearly smaller. I later found that what wget produces matches what the browser saves with "Save As > Webpage, HTML Only".
I now want a program or command (Linux or Windows, either is fine) that produces the same content as "Save As > Webpage, Complete". Is there a way to do this?
Reply to discussion (solution)
Saving a complete web page is a browser feature. In the HTML file it saves, the paths of all referenced files are rewritten to point to local copies, and those resource files are saved alongside it. This is something the browser software does itself.
Is there a relatively simple way to get similar functionality?
Simulating the browser's operation and its save procedure is troublesome, for one thing, and for another it feels like a crude imitation.
I am not asking for the resources to be merged into a single HTML file. Leaving them as separate files is fine, as long as none of them are missing. This may need some background knowledge that I lack, since I have never done front-end work. I feel this should not be a very difficult requirement, but I couldn't find a way after searching for a long time.
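One option that may cover this requirement is wget itself, which has options for downloading a page's resources and rewriting the references to them. The sketch below only builds and runs such a command from Python. The function names, the `saved_page` output directory, and the example URL are placeholders for illustration, not something from this thread. Note that wget does not execute JavaScript, so resources that a page loads through scripts would still be missing.

```python
import subprocess


def build_wget_command(url, out_dir="saved_page"):
    """Return a wget argument list that fetches a page plus the files it references."""
    return [
        "wget",
        "--page-requisites",   # also download images, CSS, and scripts the page needs
        "--convert-links",     # rewrite references so they point at the local copies
        "--adjust-extension",  # add .html/.css suffixes where the server omitted them
        "--span-hosts",        # allow resources hosted on other domains, such as a CDN
        "--directory-prefix", out_dir,
        url,
    ]


def save_page(url, out_dir="saved_page"):
    """Run wget (it must be installed and on PATH); returns wget's exit code."""
    return subprocess.run(build_wget_command(url, out_dir), check=False).returncode
```

Calling `save_page("https://example.com/")` would place the page and whatever resources wget could fetch under `saved_page/`. The result will not necessarily be byte-for-byte identical to what the browser saves.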
I have never done this either, and it does not seem that simple. You need to crawl all of the page's images, JS files, and CSS files, and analyze the path references in them. An image path may be defined in the HTML or in the CSS, for example, and you have to work out which image paths are actually used.
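The analysis step described here can be sketched with the Python standard library alone. This is a minimal illustration, not a complete page saver. It only finds references in `img`, `script`, and `link` tags and in CSS `url(...)` values, and it does not download anything or rewrite paths. The names `ResourceCollector`, `find_html_resources`, and `find_css_resources` are made up for this example.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag -> attribute that usually points at a file the page needs to render.
RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href"}

# Matches url(...) references inside CSS, with or without quotes.
CSS_URL_RE = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""")


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of images, scripts, stylesheets, and icons."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        attr_map = dict(attrs)
        if tag == "link":
            # Only stylesheet and icon links are page resources.
            rel = set((attr_map.get("rel") or "").lower().split())
            if not rel & {"stylesheet", "icon"}:
                return
        value = attr_map.get(wanted)
        if value:
            # Resolve relative paths against the page's own URL.
            self.resources.append(urljoin(self.base_url, value))


def find_html_resources(html, base_url):
    """Return absolute URLs of the resources referenced by an HTML document."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.resources


def find_css_resources(css, css_url):
    """Return absolute URLs referenced by url(...) in a stylesheet.

    Paths in CSS are relative to the stylesheet's URL, not the page's URL.
    Inline data: URIs are skipped because there is nothing to download.
    """
    refs = CSS_URL_RE.findall(css)
    return [urljoin(css_url, ref) for ref in refs if not ref.startswith("data:")]
```

A full tool would still have to download each URL, run `find_css_resources` on every stylesheet it fetches, and rewrite the references in the saved HTML and CSS to the local file names. That is roughly the work the browser does when it saves a complete page.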