使用 Knex.js 批量更新记录的 QL 方法-js教程-PHP中文网

QL Approaches to Bulk Update Records with Knex.js

在 Web 开发领域，有效地使用数据库至关重要，尤其是在处理批量操作（例如一次更新多个记录）时。无论您是管理库存、处理用户数据还是处理交易，以高效可靠的方式执行批量更新的能力都是至关重要的。

在本指南中，我们将详细介绍使用 Knex.js（Node.js 的多功能查询构建器）批量更新记录的三种基本 SQL 技术。每种方法都针对不同的场景量身定制，根据您的具体用例提供独特的优势。我们将介绍：

具有多个条件的单个更新：一种允许您在单个查询中更新多个记录的方法，利用条件逻辑根据特定条件应用不同的更新。
在事务中使用单个查询进行批量更新：这种方法利用事务来确保原子性，安全高效地执行多个更新查询。
使用 onConflict 进行更新插入（插入或更新）：非常适合需要插入新记录或更新现有记录而不冒重复数据风险的场景。

在以下部分中，我们将更深入地研究每种方法，检查它们的实现、优点和最佳用例。通过了解这些方法，您可以选择最适合您的特定需求的技术，从而优化应用程序中的性能和数据完整性。

1. 多条件单次更新

当涉及到更新数据库中的多条记录时，效率是关键。一项强大的技术是使用具有多个条件的单个 UPDATE 查询。当您需要根据特定条件对不同记录应用不同的更新（所有这些都在单个 SQL 语句中）时，此方法特别有用。

概念：

“多条件单次更新”方法背后的核心思想是使用单个 UPDATE 查询来修改多行，每行可能根据其独特的特征接收不同的值。这是通过在 UPDATE 查询中使用 CASE 语句来实现的，允许您为需要更新的每个字段指定条件逻辑。

为什么使用这种方法：

效率：对于少量到中等数量的记录（例如几十到几百），将多个更新合并到单个查询中可以通过减少数量来显着提高性能数据库往返次数。这在处理高频更新时尤其有用。然而，对于非常大的数据集（数千或更多），这种方法可能不那么有效。我们稍后将在本指南中讨论处理大型数据集的替代方法。
简单性：与执行多个单独的查询相比，使用单个查询管理更新通常更简单且更易于维护。这种方法降低了数据库交互的复杂性，并使代码更易于理解，尤其是在处理适度数量的更新时。
减少开销：更少的查询意味着更少的数据库开销，这可以带来更好的整体性能。这在网络延迟或数据库负载可能影响操作速度的情况下尤其重要。
对于大量记录，我们在本指南中探讨了其他策略，以更有效地管理潜在开销。

示例实现：

下面是一个实际示例，说明如何使用 Knex.js（Node.js 的流行 SQL 查询构建器）来实现此方法。此示例演示如何一次性更新多条记录的多个字段，使用条件逻辑根据记录的 ID 应用不同的更新：

const queryHeaderProductUpdate = 'UPDATE products SET '; // Start of the SQL UPDATE query
const updatesProductUpdate = []; // Array to hold the individual update statements
const parametersProductUpdate = []; // Array to hold the parameters for the query

const updateProducts = [
  { product_id: 1, name: 'New Name 1', price: 100, status: 'Active' },
  { product_id: 2, name: 'New Name 2', price: 150, status: 'Inactive' },
  { product_id: 3, name: 'New Name 3', price: 200, status: 'Active' }
];

// Extract the product IDs to use in the WHERE clause
const productIds = updateProducts.map(p => p.product_id);

// Build the update statements for each field
updateProducts.forEach((item) => {
  // Add conditional logic for updating the 'name' field
  updatesProductUpdate.push('name = CASE WHEN product_id = ? THEN ? ELSE name END');
  parametersProductUpdate.push(item.product_id, item.name);

  // Add conditional logic for updating the 'price' field
  updatesProductUpdate.push('price = CASE WHEN product_id = ? THEN ? ELSE price END');
  parametersProductUpdate.push(item.product_id, item.price);

  // Add conditional logic for updating the 'status' field
  updatesProductUpdate.push('status = CASE WHEN product_id = ? THEN ? ELSE status END');
  parametersProductUpdate.push(item.product_id, item.status);

  // Add 'updated_at' field with the current timestamp
  updatesProductUpdate.push('updated_at = ?');
  parametersProductUpdate.push(knex.fn.now());

  // Add 'updated_by' field with the user ID
  updatesProductUpdate.push('updated_by = ?');
  parametersProductUpdate.push(req.user.userId);
});

// Construct the full query by joining the individual update statements and adding the WHERE clause
const queryProductUpdate = `${queryHeaderProductUpdate + updatesProductUpdate.join(', ')} WHERE product_id IN (${productIds.join(', ')})`;

// Execute the update query
await db.raw(queryProductUpdate, parametersProductUpdate);

登录后复制

这段代码的作用：

构造查询标头：开始产品表的 UPDATE 语句。
构建条件更新：使用CASE语句根据product_id为每个字段指定不同的更新。
生成完整查询：组合更新语句和 WHERE 子句。
执行查询：运行构造的查询以将更新应用到指定记录。

通过实施此技术，您可以使用条件逻辑高效处理批量更新，使您的数据库操作更加简化和有效。

Note: In the provided example, we did not use a transaction because the operation involves a single SQL query. Since a single query inherently maintains data integrity and consistency, there's no need for an additional transaction. Adding a transaction would only increase overhead without providing additional benefits in this context.

Having explored the "Single Update with Multiple Conditions" approach, which works well for a moderate number of records and provides simplicity and efficiency, we now turn our attention to a different scenario. As datasets grow larger or when atomicity across multiple operations becomes crucial, managing updates effectively requires a more robust approach.

Batch Updates with Individual Queries in a Transaction is a method designed to address these needs. This approach involves executing multiple update queries within a single transaction, ensuring that all updates are applied atomically. Let's dive into how this method works and its advantages.

2. Batch Updates with Individual Queries in a Transaction

When dealing with bulk updates, especially for a large dataset, managing each update individually within a transaction can be a robust and reliable approach. This method ensures that all updates are applied atomically and can handle errors gracefully.

Why Use This Approach:

Scalability: For larger datasets where Single Update with Multiple Conditions might become inefficient, batch updates with transactions offer better control. Each query is executed separately, and a transaction ensures that all changes are committed together, reducing the risk of partial updates.
Error Handling: Transactions provide a safety net by ensuring that either all updates succeed or none do. This atomicity guarantees data integrity, making it ideal for scenarios where you need to perform multiple related updates.
Concurrency Control: Using transactions can help manage concurrent modifications to the same records, preventing conflicts and ensuring consistency.

Code Example

Here’s how you can implement batch updates with individual queries inside a transaction using Knex.js:

const updateRecordsInBatch = async () => {
    // Example data to update
    const dataToUpdate = [
        { id: 1, name: 'Updated Name 1', price: 100 },
        { id: 2, name: 'Updated Name 2', price: 200 },
        { id: 3, name: 'Updated Name 3', price: 300 }
    ];

    // Start a transaction
    const trx = await db.transaction();
    const promises = [];

    try {
        // Iterate over the data and push update queries to the promises array
        dataToUpdate.forEach(record => {
            promises.push(
                trx('products')
                    .update({
                        name: record.name,
                        price: record.price,
                        updated_at: trx.fn.now()
                    })
                    .where('id', record.id)
            );
        });

        // Execute all queries concurrently
        await Promise.all(promises);

        // Commit the transaction
        await trx.commit();
        console.log('All records updated successfully.');
    } catch (error) {
        // Rollback the transaction in case of error
        await trx.rollback();
        console.error('Update failed:', error);
    }
};

登录后复制

Explanation

Transaction Initialization: The transaction is started using db.transaction(), which ensures that all subsequent queries are executed within this transaction.
Batch Updates: Each update query is constructed and added to an array of promises. This method allows for multiple updates to be performed concurrently.
Executing Queries: Promise.all(promises) is used to execute all update queries concurrently. This approach ensures that all updates are sent to the database in parallel.
Committing or Rolling Back: If all queries succeed, the transaction is committed with trx.commit(). If any query fails, the transaction is rolled back with trx.rollback(), ensuring that no partial updates are applied.

Using batch updates with individual queries inside a transaction provides a reliable way to manage large datasets. It ensures data integrity through atomic transactions and offers better control over concurrent operations. This method is especially useful when Single Update with Multiple Conditions may not be efficient for very large datasets.

3. Upsert (Insert or Update) Using onConflict

When you're working with data that might need to be inserted or updated depending on its existence in the database, an "upsert" operation is the ideal solution. This approach allows you to handle both scenarios—insert new records or update existing ones—in a single, streamlined operation. It's particularly useful when you want to maintain data consistency without having to write separate logic for checking whether a record exists.

Why Use This Approach:

Simplicity: An upsert enables you to combine the insert and update operations into a single query, simplifying your code and reducing the need for additional checks.
Efficiency: This method is more efficient than performing separate insert and update operations, as it minimizes database round-trips and handles conflicts automatically.
Conflict Handling: The onConflict clause lets you specify how to handle conflicts, such as when records with unique constraints already exist, by updating the relevant fields.

const productData = [
  {
    product_id: 1,
    store_id: 101,
    product_name: 'Product A',
    price: 10.99,
    category: 'Electronics',
  },
  {
    product_id: 2,
    store_id: 102,
    product_name: 'Product B',
    price: 12.99,
    category: 'Books',
  },
  {
    product_id: 3,
    store_id: 103,
    product_name: 'Product C',
    price: 9.99,
    category: 'Home',
  },
  {
    product_id: 4,
    store_id: 104,
    product_name: 'Product D',
    price: 15.49,
    category: 'Garden',
  },
];

await knex('products')
  .insert(productData)
  .onConflict(['product_id', 'store_id'])
  .merge({
    product_name: knex.raw('EXCLUDED.product_name'),
    price: knex.raw('EXCLUDED.price'),
    category: knex.raw('EXCLUDED.category'),
  });

登录后复制

Explanation

数据定义：我们定义productData，它是表示我们要插入或更新的产品记录的对象数组。每个对象包含一个product_id、store_id、product_name、价格和类别。
插入或更新：knex('products').insert(productData) 函数尝试将 ProductData 数组中的每条记录插入到 products 表中。
处理冲突：onConflict(['product_id', 'store_id']) 子句指定如果product_id 和 store_id 的组合发生冲突，则应执行下一步。
合并（冲突时更新）：当检测到冲突时，merge({...}) 方法会使用productData 中的新产品名称、价格和类别值更新现有记录。 knex.raw('EXCLUDED.column_name') 语法用于引用已插入的值，允许数据库使用这些值更新现有记录。

为了使 onConflict 子句在 upsert 操作中正确运行，所涉及的列必须是唯一约束的一部分。其工作原理如下：

单个唯一列：如果您在 onConflict 子句中使用单个列，则该列在整个表中必须是唯一的。这种唯一性保证了数据库可以根据该列准确检测记录是否已经存在。
多列：当 onConflict 子句中使用多列时，这些列的组合必须是唯一的。这种唯一性是通过唯一索引或约束来强制执行的，这可确保这些列的组合值在整个表中是唯一的。

索引和约束：
索引：一个或多个列上的唯一索引允许数据库有效地检查值的唯一性。当您定义唯一索引时，数据库将使用它来快速验证指定列中的值是否已存在。这使得 onConflict 子句能够准确地检测和处理冲突。

约束： 唯一约束可确保一列或多列中的值必须是唯一的。此约束对于 onConflict 子句的工作至关重要，因为它强制执行防止重复值的规则，并允许数据库根据这些列检测冲突。

与具有多个条件的单一更新方法类似，更新插入操作不需要事务。由于它涉及插入或更新记录的单个查询，因此它可以高效运行，而无需管理事务的额外开销。