Deploy Elasticsearch based on memory storage - 100 million+ pieces of data, full-text search 100ms response-AI-php.cn

Table of Contents

1. Mount the memory storage directory on the host

2. Create PVC on Kubernetes cluster

3. Deploy Elasticsearch related components

4. Import data

5. 测试与验证

6. 总结

参考资料

Home

Technology peripherals

Deploy Elasticsearch based on memory storage - 100 million+ pieces of data, full-text search 100ms response

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2024 am 11:11 AM

cpu ai Computing power

1. Mount the memory storage directory on the host

Create a directory for mounting

mkdir /mnt/memory_storage

Copy after login

Mount the tmpfs file system

mount -t tmpfs -o size=800G tmpfs /mnt/memory_storage

Copy after login

The storage space will be used on demand, that is, 100G of memory will be occupied when 100G storage is used. There is 2T memory on the host node, and 800G memory is allocated here to store Elasticsearch data.

Create the directory in advance

mkdir /mnt/memory_storage/elasticsearch-data-es-jfs-prod-es-default-0mkdir /mnt/memory_storage/elasticsearch-data-es-jfs-prod-es-default-1mkdir /mnt/memory_storage/elasticsearch-data-es-jfs-prod-es-default-2

Copy after login

If the directory is not created in advance and given read and write permissions, the Elasticsearch component will not be able to start. Prompt that multiple nodes use the same data directory.

Configure directory permissions

chmod -R 777 /mnt/memory_storage

Copy after login

DD Test IO bandwidth

dd if=/dev/zero of=/mnt/memory_storage/dd.txt bs=4M count=25002500+0 records in2500+0 records out10485760000 bytes (10 GB, 9.8 GiB) copied, 3.53769 s, 3.0 GB/s

Copy after login

Clean files

rm -rf /mnt/memory_storage/dd.txt

Copy after login

FIO test IO bandwidth

fio --name=test --filename=/mnt/memory_storage/fio_test_file --size=10G --rw=write --bs=4M --numjobs=1 --runtime=60 --time_basedRun status group 0 (all jobs):WRITE: bw=2942MiB/s (3085MB/s), 2942MiB/s-2942MiB/s (3085MB/s-3085MB/s), io=172GiB (185GB), run=60001-60001msec

Copy after login

Clean files

rm -rf /mnt/memory_storage/fio_test_file

Copy after login

Test the memory IO bandwidth

mbw 10000Long uses 8 bytes. Allocating 2*1310720000 elements = 20971520000 bytes of memory.Using 262144 bytes as blocks for memcpy block copy test.Getting down to business... Doing 10 runs per test.0 Method: MEMCPY Elapsed: 1.62143 MiB: 10000.00000 Copy: 6167.380 MiB/s1 Method: MEMCPY Elapsed: 1.63542 MiB: 10000.00000 Copy: 6114.656 MiB/s2 Method: MEMCPY Elapsed: 1.63345 MiB: 10000.00000 Copy: 6121.997 MiB/s3 Method: MEMCPY Elapsed: 1.63715 MiB: 10000.00000 Copy: 6108.161 MiB/s4 Method: MEMCPY Elapsed: 1.64429 MiB: 10000.00000 Copy: 6081.667 MiB/s5 Method: MEMCPY Elapsed: 1.62772 MiB: 10000.00000 Copy: 6143.574 MiB/s6 Method: MEMCPY Elapsed: 1.60684 MiB: 10000.00000 Copy: 6223.379 MiB/s7 Method: MEMCPY Elapsed: 1.62499 MiB: 10000.00000 Copy: 6153.876 MiB/s8 Method: MEMCPY Elapsed: 1.63967 MiB: 10000.00000 Copy: 6098.770 MiB/s9 Method: MEMCPY Elapsed: 2.97213 MiB: 10000.00000 Copy: 3364.588 MiB/sAVG Method: MEMCPY Elapsed: 1.76431 MiB: 10000.00000 Copy: 5667.937 MiB/s0 Method: DUMB Elapsed: 1.01521 MiB: 10000.00000 Copy: 9850.140 MiB/s1 Method: DUMB Elapsed: 0.85378 MiB: 10000.00000 Copy: 11712.605 MiB/s2 Method: DUMB Elapsed: 0.82487 MiB: 10000.00000 Copy: 12123.167 MiB/s3 Method: DUMB Elapsed: 0.84520 MiB: 10000.00000 Copy: 11831.463 MiB/s4 Method: DUMB Elapsed: 0.83050 MiB: 10000.00000 Copy: 12040.968 MiB/s5 Method: DUMB Elapsed: 0.84932 MiB: 10000.00000 Copy: 11774.194 MiB/s6 Method: DUMB Elapsed: 0.82491 MiB: 10000.00000 Copy: 12122.505 MiB/s7 Method: DUMB Elapsed: 1.44235 MiB: 10000.00000 Copy: 6933.144 MiB/s8 Method: DUMB Elapsed: 2.68656 MiB: 10000.00000 Copy: 3722.225 MiB/s9 Method: DUMB Elapsed: 8.44667 MiB: 10000.00000 Copy: 1183.898 MiB/sAVG Method: DUMB Elapsed: 1.86194 MiB: 10000.00000 Copy: 5370.750 MiB/s0 Method: MCBLOCK Elapsed: 4.52486 MiB: 10000.00000 Copy: 2210.013 MiB/s1 Method: MCBLOCK Elapsed: 4.82467 MiB: 10000.00000 Copy: 2072.683 MiB/s2 Method: MCBLOCK Elapsed: 0.84797 MiB: 10000.00000 Copy: 11792.870 MiB/s3 Method: MCBLOCK Elapsed: 0.84980 MiB: 10000.00000 Copy: 11767.516 MiB/s4 Method: MCBLOCK Elapsed: 0.87665 MiB: 10000.00000 Copy: 11407.113 MiB/s5 Method: MCBLOCK Elapsed: 0.85952 MiB: 10000.00000 Copy: 11634.468 MiB/s6 Method: MCBLOCK Elapsed: 0.84132 MiB: 10000.00000 Copy: 11886.154 MiB/s7 Method: MCBLOCK Elapsed: 0.84970 MiB: 10000.00000 Copy: 11768.915 MiB/s8 Method: MCBLOCK Elapsed: 0.86918 MiB: 10000.00000 Copy: 11505.150 MiB/s9 Method: MCBLOCK Elapsed: 0.85996 MiB: 10000.00000 Copy: 11628.434 MiB/sAVG Method: MCBLOCK Elapsed: 1.62036 MiB: 10000.00000 Copy: 6171.467 MiB/s

Copy after login

It seems that the memory is mounted as the IO of the file system The bandwidth can only reach half of the IO bandwidth of the memory.

2. Create PVC on Kubernetes cluster

Configure environment variables

export NAMESPACE=data-centerexport PVC_NAME=elasticsearch-data-es-jfs-prod-es-default-0

Copy after login

Create PV and PVC

kubectl create -f - <<EOFapiVersion: v1kind: PersistentVolumemetadata:name: ${PVC_NAME}namespace: ${NAMESPACE}spec:accessModes:- ReadWriteManycapacity:storage: 800GihostPath:path: /mnt/memory_storage/${PVC_NAME}---apiVersion: v1kind: PersistentVolumeClaimmetadata:name: ${PVC_NAME}namespace: ${NAMESPACE}spec:accessModes:- ReadWriteManyresources:requests:storage: 800GiEOF

Copy after login

Create at least 3 PVC applications by modifying the PVC_NAME variable. Finally, I created 20 PVCs, providing a total of 15+ TB of storage.

Some content is omitted here. For details, please refer to Using JuiceFS to store Elasticsearch data[1].

Deploy Elasticsearch

cat <<EOF | kubectl apply -f -apiVersion: elasticsearch.k8s.elastic.co/v1kind: Elasticsearchmetadata:namespace: $NAMESPACEname: es-jfs-prodspec:version: 8.3.2image: hubimage/elasticsearch:8.3.2http:tls:selfSignedCertificate:disabled: truenodeSets:- name: defaultcount: 3config:node.store.allow_mmap: falseindex.store.type: niofspodTemplate:spec:nodeSelector:servertype: Ascend910B-24initContainers:- name: sysctlsecurityContext:privileged: truerunAsUser: 0command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']- name: install-pluginscommand:- sh- -c- |bin/elasticsearch-plugin install --batch https://get.infini.cloud/elasticsearch/analysis-ik/8.3.2securityContext:runAsUser: 0runAsGroup: 0containers:- name: elasticsearchreadinessProbe:exec:command:- bash- -c- /mnt/elastic-internal/scripts/readiness-probe-script.shfailureThreshold: 10initialDelaySeconds: 30periodSeconds: 30successThreshold: 1timeoutSeconds: 30env:- name: "ES_JAVA_OPTS"value: "-Xms31g -Xmx31g"- name: "NSS_SDB_USE_CACHE"value: "no"resources:requests:cpu: 8memory: 64GiEOF

Copy after login

View Elasticsearch password

kubectl -n $NAMESPACE get secret es-jfs-prod-es-elastic-user -o go-template='{{.data.elastic | base64decode}}'xxx

Copy after login

Default The username is elastic

Deploy Metricbeat

kubectl apply -f - <<EOFapiVersion: beat.k8s.elastic.co/v1beta1kind: Beatmetadata:name: es-jfs-prodnamespace: $NAMESPACEspec:type: metricbeatversion: 8.3.2elasticsearchRef:name: es-jfs-prodconfig:metricbeat:autodiscover:providers:- type: kubernetesscope: clusterhints.enabled: truetemplates:- config:- module: kubernetesmetricsets:- eventperiod: 10sprocessors:- add_cloud_metadata: {}logging.json: truedeployment:podTemplate:spec:serviceAccountName: metricbeatautomountServiceAccountToken: true# required to read /etc/beat.ymlsecurityContext:runAsUser: 0EOF

Copy after login

Deploy Kibana

cat <<EOF | kubectl apply -f -apiVersion: kibana.k8s.elastic.co/v1kind: Kibanametadata:namespace: $NAMESPACEname: es-jfs-prodspec:version: 8.3.2count: 1image: hubimage/kibana:8.3.2elasticsearchRef:name: es-jfs-prodhttp:tls:selfSignedCertificate:disabled: trueEOF

Copy after login

View Elasticsearch cluster information

部署基于内存存储的 Elasticsearch - 一亿+条数据，全文检索 100ms 响应 Picture

4. Import data

Create index

Execute in the Dev Tools page of Elasticsearch Management:

PUT /bayou_tt_articles{"settings": {"index": {"number_of_shards": 30,"number_of_replicas": 1,"refresh_interval": "120s","translog.durability": "async","translog.sync_interval": "120s","translog.flush_threshold_size": "2048M"}},"mappings": {"properties": {"text": {"type": "text","analyzer": "ik_smart"}}}}

Copy after login

There are two things to note:

Keep each point The slice size is between 10-50G, here number_of_shards is set to 30, because there are a few hundred GB of data that need to be imported.
The number of replicas is at least 1 to ensure that the Pod will not lose data during rolling updates. When the IP of the Pod changes, Elasticsearch will consider it a new node and cannot reuse the previous data. If there is no copy to rebuild the shards at this time, data loss will occur.

Install the import tool

You can also use the elasticdump container to import, and there are examples below. Installed here using npm.

apt-get install npm -y

Copy after login

npm install elasticdump -g

Copy after login

Import data

export DATAPATH=./bayou_tt_articles_0.jsonlnohup elasticdump --limit 20000 --input=${DATAPATH} --output=http://elastic:xxx@x.x.x.x:31391/ --output-index=bayou_tt_articles --type=data --transform="doc._source=Object.assign({},doc)" > elasticdump-${DATAPATH}.log 2>&1 &

Copy after login

limit 表示每次导入的数据条数，默认值是 100 太小，建议在保障导入成功的前提下尽可能大一点。

查看索引速率

部署基于内存存储的 Elasticsearch - 一亿+条数据，全文检索 100ms 响应图片

索引速率达到 1w+/s，但上限远不止于此。因为，根据社区文档的压力测试结果显示，单个节点至少能提供 2W/s 的索引速率。

5. 测试与验证

全文检索性能显著提升

部署基于内存存储的 Elasticsearch - 一亿+条数据，全文检索 100ms 响应图片

上图是使用 JuiceFS 存储的全文检索速度为 18s，使用 SSD 节点的 Elasticsearch 的全文检索速度为 5s。下图是使用内存存储的 Elasticsearch 的全文检索速度为 100ms 左右。

部署基于内存存储的 Elasticsearch - 一亿+条数据，全文检索 100ms 响应图片

更新 Elasticsearch 不会丢数据

之前给 Elasticsearch Pod 分配的 CPU 和 Memory 太多，调整为 CPU 32C，Memory 64 GB。在滚动更新过程中，Elasticsearch 始终可用，并且数据没有丢失。

但务必注意设置 replicas > 1，尽量不要自行重启 Pod，虽然 Pod 是原节点更新。

能平稳实现节点的扩容

部署基于内存存储的 Elasticsearch - 一亿+条数据，全文检索 100ms 响应图片

由于业务总的 Elasticsearch 存储需求是 10T 左右，我继续增加节点到 10 个，Elasticsearch 的索引分片会自动迁移，均匀分布在这些节点上。

导出索引速度达 1w 条每秒

docker run --rm -ti elasticdump/elasticsearch-dump --limit 10000 --input=http://elastic:xxx@x.x.x.x:31391/bayou_tt_articles --output=/data/es-bayou_tt_articles-output.json --type=data

Copy after login

Wed, 29 May 2024 01:41:23 GMT | got 10000 objects from source elasticsearch (offset: 0)Wed, 29 May 2024 01:41:23 GMT | sent 10000 objects to destination file, wrote 10000Wed, 29 May 2024 01:41:24 GMT | got 10000 objects from source elasticsearch (offset: 10000)Wed, 29 May 2024 01:41:24 GMT | sent 10000 objects to destination file, wrote 10000Wed, 29 May 2024 01:41:25 GMT | got 10000 objects from source elasticsearch (offset: 20000)Wed, 29 May 2024 01:41:25 GMT | sent 10000 objects to destination file, wrote 10000Wed, 29 May 2024 01:41:25 GMT | got 10000 objects from source elasticsearch (offset: 30000)

Copy after login

导出速度能达到 1w 条每秒，一亿条数据大约需要 3h，基本也能满足索引的备份、迁移需求。

Elasticsearch 节点 Pod 更新时，不会发生漂移

更新之前的 Pod 分布节点如下：

NAME READY STATUSRESTARTSAGE IP NODE NOMINATED NODE READINESS GATESes-jfs-prod-beat-metricbeat-7fbdd657c4-djgg6 1/1 Running 6 (32m ago) 18h 10.244.54.5ascend-01 <none> <none>es-jfs-prod-es-default-0 1/1 Running 0 28m 10.244.46.82 ascend-07 <none> <none>es-jfs-prod-es-default-1 1/1 Running 0 29m 10.244.23.77 ascend-53 <none> <none>es-jfs-prod-es-default-2 1/1 Running 0 31m 10.244.49.65 ascend-20 <none> <none>es-jfs-prod-es-default-3 1/1 Running 0 32m 10.244.54.14 ascend-01 <none> <none>es-jfs-prod-es-default-4 1/1 Running 0 34m 10.244.100.239 ascend-40 <none> <none>es-jfs-prod-es-default-5 1/1 Running 0 35m 10.244.97.201ascend-39 <none> <none>es-jfs-prod-es-default-6 1/1 Running 0 37m 10.244.101.156 ascend-38 <none> <none>es-jfs-prod-es-default-7 1/1 Running 0 39m 10.244.19.101ascend-49 <none> <none>es-jfs-prod-es-default-8 1/1 Running 0 40m 10.244.16.109ascend-46 <none> <none>es-jfs-prod-es-default-9 1/1 Running 0 41m 10.244.39.119ascend-15 <none> <none>es-jfs-prod-kb-75f7bbd96-6tcrn 1/1 Running 0 18h 10.244.1.164 ascend-22 <none> <none>

Copy after login

更新之后的 Pod 分布节点如下：

NAME READY STATUSRESTARTSAGE IP NODE NOMINATED NODE READINESS GATESes-jfs-prod-beat-metricbeat-7fbdd657c4-djgg6 1/1 Running 6 (50m ago) 18h 10.244.54.5ascend-01 <none> <none>es-jfs-prod-es-default-0 1/1 Running 0 72s 10.244.46.83 ascend-07 <none> <none>es-jfs-prod-es-default-1 1/1 Running 0 2m35s 10.244.23.78 ascend-53 <none> <none>es-jfs-prod-es-default-2 1/1 Running 0 3m59s 10.244.49.66 ascend-20 <none> <none>es-jfs-prod-es-default-3 1/1 Running 0 5m34s 10.244.54.15 ascend-01 <none> <none>es-jfs-prod-es-default-4 1/1 Running 0 7m21s 10.244.100.240 ascend-40 <none> <none>es-jfs-prod-es-default-5 1/1 Running 0 8m44s 10.244.97.202ascend-39 <none> <none>es-jfs-prod-es-default-6 1/1 Running 0 10m 10.244.101.157 ascend-38 <none> <none>es-jfs-prod-es-default-7 1/1 Running 0 11m 10.244.19.102ascend-49 <none> <none>es-jfs-prod-es-default-8 1/1 Running 0 13m 10.244.16.110ascend-46 <none> <none>es-jfs-prod-es-default-9 1/1 Running 0 14m 10.244.39.120ascend-15 <none> <none>es-jfs-prod-kb-75f7bbd96-6tcrn 1/1 Running 0 18h 10.244.1.164 ascend-22 <none> <none>

Copy after login

这点打消了我的一个顾虑， Elasticsearch 的 Pod 重启时，发生了漂移，那么节点上是否会残留分片的数据，导致内存使用不断膨胀？答案是，不会。ECK Operator 似乎能让 Pod 在原节点进行重启，挂载的 Hostpath 数据依然对新的 Pod 有效，仅当主机节点发生重启时，才会丢失数据。

6. 总结

AI 的算力节点有大量空闲的 CPU 和 Memory 资源，使用这些大内存的主机节点，部署一些短生命周期的基于内存存储的高性能应用，有利于提高资源的使用效率。

本篇主要介绍了借助于 Hostpath 的内存存储部署 Elasticsearch 提供高性能查询能力的方案，具体内容如下：

将内存 mount 目录到主机上
创建基于 Hostpath 的 PVC，将数据挂载到上述目录
使用 ECK Operator 部署 Elasticsearch
Elasticsearch 更新时，数据并不会丢失，但不能同时重启多个主机节点
300+GB、一亿+条数据，全文检索响应场景中，基于 JuiceFS 存储的速度为 18s， SSD 节点的速度为 5s，内存节点的速度为 100ms

参考资料

[1]使用 JuiceFS 存储 Elasticsearch 数据: https://www.chenshaowen.com/blog/store-elasticsearch-data-in-juicefs.html

The above is the detailed content of Deploy Elasticsearch based on memory storage - 100 million+ pieces of data, full-text search 100ms response. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

How to fix KB5055523 fails to install in Windows 11?

4 weeks ago By DDD

How to fix KB5055518 fails to install in Windows 10?

4 weeks ago By DDD

Roblox: Grow A Garden - Complete Mutation Guide

3 weeks ago By DDD

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

How to fix KB5055612 fails to install in Windows 10?

3 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial

1664

CakePHP Tutorial

1423

Laravel Tutorial

1317

PHP Tutorial

1268

C# Tutorial

1242

Related knowledge

How to understand DMA operations in C? Apr 28, 2025 pm 10:09 PM

DMA in C refers to DirectMemoryAccess, a direct memory access technology, allowing hardware devices to directly transmit data to memory without CPU intervention. 1) DMA operation is highly dependent on hardware devices and drivers, and the implementation method varies from system to system. 2) Direct access to memory may bring security risks, and the correctness and security of the code must be ensured. 3) DMA can improve performance, but improper use may lead to degradation of system performance. Through practice and learning, we can master the skills of using DMA and maximize its effectiveness in scenarios such as high-speed data transmission and real-time signal processing.

How to use the chrono library in C? Apr 28, 2025 pm 10:18 PM

Using the chrono library in C can allow you to control time and time intervals more accurately. Let's explore the charm of this library. C's chrono library is part of the standard library, which provides a modern way to deal with time and time intervals. For programmers who have suffered from time.h and ctime, chrono is undoubtedly a boon. It not only improves the readability and maintainability of the code, but also provides higher accuracy and flexibility. Let's start with the basics. The chrono library mainly includes the following key components: std::chrono::system_clock: represents the system clock, used to obtain the current time. std::chron

Quantitative Exchange Ranking 2025 Top 10 Recommendations for Digital Currency Quantitative Trading APPs Apr 30, 2025 pm 07:24 PM

The built-in quantization tools on the exchange include: 1. Binance: Provides Binance Futures quantitative module, low handling fees, and supports AI-assisted transactions. 2. OKX (Ouyi): Supports multi-account management and intelligent order routing, and provides institutional-level risk control. The independent quantitative strategy platforms include: 3. 3Commas: drag-and-drop strategy generator, suitable for multi-platform hedging arbitrage. 4. Quadency: Professional-level algorithm strategy library, supporting customized risk thresholds. 5. Pionex: Built-in 16 preset strategy, low transaction fee. Vertical domain tools include: 6. Cryptohopper: cloud-based quantitative platform, supporting 150 technical indicators. 7. Bitsgap:

How to handle high DPI display in C? Apr 28, 2025 pm 09:57 PM

Handling high DPI display in C can be achieved through the following steps: 1) Understand DPI and scaling, use the operating system API to obtain DPI information and adjust the graphics output; 2) Handle cross-platform compatibility, use cross-platform graphics libraries such as SDL or Qt; 3) Perform performance optimization, improve performance through cache, hardware acceleration, and dynamic adjustment of the details level; 4) Solve common problems, such as blurred text and interface elements are too small, and solve by correctly applying DPI scaling.

What is real-time operating system programming in C? Apr 28, 2025 pm 10:15 PM

C performs well in real-time operating system (RTOS) programming, providing efficient execution efficiency and precise time management. 1) C Meet the needs of RTOS through direct operation of hardware resources and efficient memory management. 2) Using object-oriented features, C can design a flexible task scheduling system. 3) C supports efficient interrupt processing, but dynamic memory allocation and exception processing must be avoided to ensure real-time. 4) Template programming and inline functions help in performance optimization. 5) In practical applications, C can be used to implement an efficient logging system.

How to use string streams in C? Apr 28, 2025 pm 09:12 PM

The main steps and precautions for using string streams in C are as follows: 1. Create an output string stream and convert data, such as converting integers into strings. 2. Apply to serialization of complex data structures, such as converting vector into strings. 3. Pay attention to performance issues and avoid frequent use of string streams when processing large amounts of data. You can consider using the append method of std::string. 4. Pay attention to memory management and avoid frequent creation and destruction of string stream objects. You can reuse or use std::stringstream.

How to measure thread performance in C? Apr 28, 2025 pm 10:21 PM

Measuring thread performance in C can use the timing tools, performance analysis tools, and custom timers in the standard library. 1. Use the library to measure execution time. 2. Use gprof for performance analysis. The steps include adding the -pg option during compilation, running the program to generate a gmon.out file, and generating a performance report. 3. Use Valgrind's Callgrind module to perform more detailed analysis. The steps include running the program to generate the callgrind.out file and viewing the results using kcachegrind. 4. Custom timers can flexibly measure the execution time of a specific code segment. These methods help to fully understand thread performance and optimize code.

An efficient way to batch insert data in MySQL Apr 29, 2025 pm 04:18 PM

Efficient methods for batch inserting data in MySQL include: 1. Using INSERTINTO...VALUES syntax, 2. Using LOADDATAINFILE command, 3. Using transaction processing, 4. Adjust batch size, 5. Disable indexing, 6. Using INSERTIGNORE or INSERT...ONDUPLICATEKEYUPDATE, these methods can significantly improve database operation efficiency.

See all articles