I would like to introduce middy-store, a new library I have been building over the past few months. I had been pondering this idea for a while, going back to this feature request that I opened more than a year ago. middy-store is a middleware for Middy that automatically stores and loads payloads from and to a Store like Amazon S3 or potentially other services.
AWS services have certain limits that one must be aware of. For example, AWS Lambda has a payload limit of 6MB for synchronous invocations and 256KB for asynchronous invocations. AWS Step Functions allows for a maximum input or output size of 256KB of data as a UTF-8 encoded string. If you exceed this limit when returning data, you will encounter the infamous States.DataLimitExceeded exception.
The usual workaround for this limitation is to check the size of your payload and store it temporarily in persistent storage like Amazon S3. Then, you return the object URL or ARN for S3. The next Lambda checks whether there is a URL or ARN in the input and loads the payload from S3. As you can imagine, this results in a lot of boilerplate code to store and load payloads from and to Amazon S3, which has to be repeated in every Lambda.
This becomes even more complicated when you only want to store part of the payload in S3 and leave the rest as it is. For example, when working with Step Functions, the payload can contain control flow data for states like Choice or Map, which has to be accessed directly. This means the first Lambda stores a partial payload in S3, and the next Lambda has to load the partial payload from S3 and merge it with the rest of the payload. This requires keeping the types consistent across multiple functions, which is very error-prone.
middy-store is a middleware for Middy. It is attached to a Lambda function and is called twice during a Lambda invocation: before and after the Lambda handler() runs. It receives the input before the handler runs and receives the output from the handler after it has finished.
Let's start at the end, with the output after a successful invocation, to make it easier to follow: middy-store receives the output (the payload) from the handler() function and checks its size. To calculate the size, it stringifies the payload, if it is an object, and uses Buffer.byteLength() to calculate the size of the UTF-8 encoded string. If the size is larger than a certain configurable threshold, the payload is stored in a Store like Amazon S3. A reference to the stored payload (e.g., an S3 URL or ARN) is then returned as the output instead of the original output.
Now let's look at the next Lambda function (e.g., in a state machine), which will receive this output as its input. This time we look at the input before handler() is called: middy-store receives the input to the handler and searches for a reference to a stored payload. If it finds one, the payload is loaded from the Store and returned as the input to the handler. The handler uses the payload as if it had been passed to it directly.
Here's an example to illustrate how middy-store works:
/* ./src/functions/handler1.ts */
import { randomBytes } from 'node:crypto';
import middy from '@middy/core';
import { middyStore } from 'middy-store';
import { S3Store } from 'middy-store-s3';

export const handler1 = middy()
  .use(
    middyStore({
      stores: [new S3Store({ /* S3 options */ })],
    })
  )
  .handler(async (input) => {
    // Return 1MB of random data as a base64 encoded string as output
    return randomBytes(1024 * 1024).toString('base64');
  });

/* ./src/functions/handler2.ts */
import middy from '@middy/core';
import { middyStore } from 'middy-store';
import { S3Store } from 'middy-store-s3';

export const handler2 = middy()
  .use(
    middyStore({
      stores: [new S3Store({ /* S3 options */ })],
    })
  )
  .handler(async (input) => {
    // Print the size of the input
    return console.log(`Size: ${Buffer.from(input, "base64").byteLength / 1024 / 1024} MB`);
  });

/* ./src/workflow.ts */
// First Lambda returns a large output
// It automatically uploads the data to S3
const output1 = await handler1({});

// Output is a reference to the S3 object: { "@middy-store": "s3://bucket/key"}
console.log(output1);

// Second Lambda receives the output as input
// It automatically downloads the data from S3
const output2 = await handler2(output1);
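To make the two phases more concrete, here is a minimal sketch of the idea as a pair of before/after hooks. It is not the actual middy-store implementation: the request shape is a simplified stand-in for Middy's request object, the in-memory `memory` record replaces a real Store, and the names `SketchRequest`, `sketchMiddleware` and `THRESHOLD` are made up for this illustration.

```typescript
import { Buffer } from "node:buffer";

// Simplified request shape; Middy's real request object has more fields.
interface SketchRequest {
  event: unknown;
  response?: unknown;
}

const memory: Record<string, string> = {}; // in-memory stand-in for a Store
const THRESHOLD = 16; // bytes; tiny on purpose so the example triggers

const sketchMiddleware = {
  // Runs before the handler: swap a reference for the stored payload.
  before(request: SketchRequest): void {
    const event = request.event;
    if (event !== null && typeof event === "object" && "@middy-store" in event) {
      const key = String((event as Record<string, unknown>)["@middy-store"]);
      const body = memory[key];
      if (body !== undefined) request.event = JSON.parse(body);
    }
  },
  // Runs after the handler: swap a large output for a reference.
  after(request: SketchRequest): void {
    const text = JSON.stringify(request.response) ?? "";
    if (Buffer.byteLength(text, "utf8") >= THRESHOLD) {
      const key = `memory://${Object.keys(memory).length}`;
      memory[key] = text;
      request.response = { "@middy-store": key };
    }
  },
};

// First "Lambda": the handler output is replaced by a reference.
const first: SketchRequest = { event: {}, response: "x".repeat(100) };
sketchMiddleware.after(first);
// first.response is now { "@middy-store": "memory://0" }

// Second "Lambda": the reference is resolved before the handler would run.
const second: SketchRequest = { event: first.response };
sketchMiddleware.before(second);
// second.event is the original 100-character string again
```

The real middleware is asynchronous and talks to an external service; the synchronous version here only shows where the swap happens in each phase.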
In general, a Store is any service that allows you to store and load arbitrary payloads, like Amazon S3 or other persistent storage systems. Databases like DynamoDB can also act as a Store. The Store receives a payload from the Lambda handler, serializes it (if it's an object), and stores it in persistent storage. When the next Lambda handler needs the payload, the Store loads the payload from storage, deserializes it, and returns it.
middy-store interacts with a Store through the StoreInterface interface, which every Store has to implement. The interface defines the functions canStore() and store() to store payloads, and canLoad() and load() to load payloads.
interface StoreInterface<TPayload = unknown, TReference = unknown> {
  name: string;
  canLoad: (args: LoadArgs<unknown>) => boolean;
  load: (args: LoadArgs<TReference | unknown>) => Promise<TPayload>;
  canStore: (args: StoreArgs<TPayload>) => boolean;
  store: (args: StoreArgs<TPayload>) => Promise<TReference>;
}
canStore() serves as a guard to check if the Store can store a given payload. It receives the payload and its byte size and checks if the payload fits within the maximum size limits of the Store. For example, a Store backed by DynamoDB has a maximum item size of 400KB, while an S3 Store has effectively no limit on the payload size it can store.
store() receives a payload and stores it in its underlying storage system. It returns a reference to the payload, which is a unique identifier to identify the stored payload within the underlying service. For example, the Amazon S3 Store uses an S3 URI in the format s3://
canLoad() acts like a filter to check if the Store can load a certain reference. It receives the reference to a stored payload and checks if it's a valid identifier for the underlying storage system. For example, the Amazon S3 Store checks if the reference is a valid S3 URI, while a DynamoDB Store would check if it's a valid ARN.
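As a rough illustration of such a filter, the sketch below checks whether a reference looks like an S3 URI or an S3 object ARN. The helper names (`isS3Uri`, `isS3ObjectArn`, `canLoadS3`) are hypothetical and the checks are deliberately loose; the real S3Store may validate references differently.

```typescript
// Hypothetical helpers, not part of middy-store-s3.

// Matches "s3://bucket/key" with a non-empty bucket and key.
function isS3Uri(reference: unknown): reference is string {
  return typeof reference === "string" && /^s3:\/\/[^/]+\/.+$/.test(reference);
}

// Matches "arn:aws:s3:::bucket/key" with a non-empty bucket and key.
function isS3ObjectArn(reference: unknown): reference is string {
  return typeof reference === "string" && /^arn:aws:s3:::[^/]+\/.+$/.test(reference);
}

// A canLoad-style filter: accept only references this Store understands.
function canLoadS3(reference: unknown): boolean {
  return isS3Uri(reference) || isS3ObjectArn(reference);
}

// canLoadS3("s3://bucket/key")          -> true
// canLoadS3("arn:aws:s3:::bucket/key")  -> true
// canLoadS3("https://example.com/file") -> false
```

Because the filter only inspects the shape of the reference, it stays cheap to call for every Store before any network request is made.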
load() receives the reference to a stored payload and loads the payload from storage. Depending on the Store, the payload will be deserialized into its original type according to the metadata that was stored alongside it. For example, a payload of type application/json will get parsed back into a JSON object, while a plain string of type text/plain will remain unaltered.
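The deserialization step can be pictured roughly like this. The function name `deserialize` and the way the content type is passed in are assumptions for the sake of the example, not the actual S3Store code.

```typescript
// Hypothetical helper: turn a stored body back into its original type,
// based on the content type that was saved alongside it.
function deserialize(body: string, contentType: string): unknown {
  // Ignore parameters such as "; charset=utf-8".
  const [mediaType = ""] = contentType.split(";");
  if (mediaType.trim().toLowerCase() === "application/json") {
    return JSON.parse(body); // back into an object, array, number, ...
  }
  return body; // text/plain and anything unknown stays a string
}

// deserialize('{"a":1}', "application/json") -> { a: 1 }
// deserialize("hello", "text/plain")         -> "hello"
```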
Most of the time, you will only need one Store, like Amazon S3, which can effectively store any payload. However, middy-store lets you work with multiple Stores at the same time. This can be useful if you want to store different types of payloads in different Stores. For example, you might want to store large payloads in S3 and small payloads in DynamoDB.
middy-store accepts an array of Stores in its stores option.
On the other hand, when middy-store runs after the handler and the output is larger than the maximum allowed size, it will iterate over the Stores and call canStore() for each Store. The first Store that returns true will be used to store the payload with store().
Therefore, the order of the Stores in the array matters.
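A small sketch shows why the order matters. `SketchStore` is a cut-down stand-in for StoreInterface, the argument name `byteSize` is an assumption, and `pickStore` is a hypothetical helper rather than a function exported by middy-store.

```typescript
// Minimal stand-in for the parts of a Store used here; the real
// StoreInterface has more members and may use different argument shapes.
interface SketchStore {
  name: string;
  canStore: (args: { payload: unknown; byteSize: number }) => boolean;
}

// Return the first Store that accepts the payload, or undefined if none does.
function pickStore(
  stores: SketchStore[],
  payload: unknown,
  byteSize: number,
): SketchStore | undefined {
  return stores.find((store) => store.canStore({ payload, byteSize }));
}

const dynamoLike: SketchStore = {
  name: "dynamodb-like",
  canStore: ({ byteSize }) => byteSize <= 400 * 1024, // 400KB item limit
};

const s3Like: SketchStore = {
  name: "s3-like",
  canStore: () => true, // effectively no size limit
};

// pickStore([dynamoLike, s3Like], "small", 1024)?.name        -> "dynamodb-like"
// pickStore([dynamoLike, s3Like], "large", 1024 * 1024)?.name -> "s3-like"
// pickStore([s3Like, dynamoLike], "small", 1024)?.name        -> "s3-like"
```

With the catch-all Store listed first, the more specific Store behind it is never reached, so the most restrictive Store should usually come first.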
When a payload is stored in a Store, middy-store will return a reference to the stored payload. The reference is a unique identifier to find the stored payload in the Store. The value of the identifier depends on the Store and its configuration. For example, the Amazon S3 Store will use an S3 URI by default. However, it can also be configured to return other formats like an ARN arn:aws:s3:::
The output from the handler after middy-store will contain the reference to the stored payload:
/* Output with reference */
{ "@middy-store": "s3://bucket/key" }
middy-store embeds the reference from the Store in the output as an object with a key "@middy-store". This allows middy-store to quickly find all references when the next Lambda function is called and load the payloads from the Store before the handler runs. In case you are wondering, middy-store recursively iterates through the input object and searches for the "@middy-store" key. That means the input can contain multiple references, even from different Stores, and middy-store will find and load them.
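The recursive search can be sketched as follows. This is an illustration of the idea only: the real middleware loads payloads asynchronously from a Store, whereas this version uses a synchronous, hypothetical `Loader` callback backed by an in-memory record.

```typescript
const REFERENCE_KEY = "@middy-store";

type Loader = (reference: unknown) => unknown;

// Walk the input depth-first and replace every { "@middy-store": ref }
// object with whatever the loader returns for ref.
function resolveReferences(input: unknown, load: Loader): unknown {
  if (Array.isArray(input)) {
    return input.map((item) => resolveReferences(item, load));
  }
  if (input !== null && typeof input === "object") {
    const record = input as Record<string, unknown>;
    if (REFERENCE_KEY in record) {
      return load(record[REFERENCE_KEY]);
    }
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(record)) {
      result[key] = resolveReferences(record[key], load);
    }
    return result;
  }
  return input; // primitives are returned unchanged
}

// In-memory stand-in for a Store, for illustration only.
const storedPayloads: Record<string, unknown> = {
  "s3://bucket/one": ["foo", "bar"],
  "s3://bucket/two": { big: true },
};
const loadFromMemory: Loader = (reference) => storedPayloads[String(reference)];

const resolved = resolveReferences(
  {
    id: 42,
    a: { "@middy-store": "s3://bucket/one" },
    list: [{ "@middy-store": "s3://bucket/two" }],
  },
  loadFromMemory,
);
// resolved: { id: 42, a: ["foo", "bar"], list: [{ big: true }] }
```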
By default, middy-store will store the entire output of the handler as a payload in the Store. However, you can also select only a part of the output to be stored. This is useful for workflows like AWS Step Functions, where you might need some of the data for control flow, e.g., a Choice state.
middy-store accepts a selector in its storingOptions config. The selector is a string path to the relevant value in the output that should be stored.
Here's an example:
const output = {
  a: {
    b: ['foo', 'bar', 'baz'],
  },
};

export const handler = middy()
  .use(
    middyStore({
      stores: [new S3Store({ /* S3 options */ })],
      storingOptions: {
        selector: '', /* selects the entire output as payload */
        // selector: 'a', /* selects the payload at the path 'a' */
        // selector: 'a.b', /* selects the payload at the path 'a.b' */
        // selector: 'a.b[0]', /* selects the payload at the path 'a.b[0]' */
        // selector: 'a.b[*]', /* selects the payloads at the paths 'a.b[0]', 'a.b[1]', 'a.b[2]', etc. */
      }
    })
  )
  .handler(async () => output);

await handler({});
The default selector is an empty string (or undefined), which selects the entire output as a payload. In this case, middy-store will return an object with only one property, which is the reference to the stored payload.
/* selector: '' */
{ "@middy-store": "s3://bucket/key" }
The selectors a, a.b, or a.b[0] select the value at the path and store only this part in the Store. The reference to the stored payload will be inserted at the path in the output, thereby replacing the original value.
/* selector: 'a' */
{
  a: { "@middy-store": "s3://bucket/key" }
}

/* selector: 'a.b' */
{
  a: {
    b: { "@middy-store": "s3://bucket/key" }
  }
}

/* selector: 'a.b[0]' */
{
  a: {
    b: [
      { "@middy-store": "s3://bucket/key" },
      'bar',
      'baz'
    ]
  }
}
A selector ending with [*] like a.b[*] acts like an iterator. It will select the array at a.b and store each element in the array in the Store separately. Each element will be replaced with the reference to the stored payload.
/* selector: 'a.b[*]' */
{
  a: {
    b: [
      { "@middy-store": "s3://bucket/key" },
      { "@middy-store": "s3://bucket/key" },
      { "@middy-store": "s3://bucket/key" }
    ]
  }
}
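To show how the selector forms relate to each other, here is a simplified sketch of replacing the selected value with a reference. It only supports dotted keys, numeric indexes and a trailing `[*]`, and it is not how middy-store parses selectors internally. `parsePath`, `storeSelected` and the `MakeReference` callback are hypothetical names; the callback stands in for a Store's store() function.

```typescript
type MakeReference = (payload: unknown) => unknown;

// Split "a.b[0]" into ["a", "b", 0]. A trailing "[*]" is stripped by the caller.
function parsePath(selector: string): Array<string | number> {
  const segments: Array<string | number> = [];
  const pattern = /([^.[\]]+)|\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(selector)) !== null) {
    const [, key, index] = match;
    if (index !== undefined) segments.push(Number(index));
    else if (key !== undefined) segments.push(key);
  }
  return segments;
}

// Return a copy of the output in which the selected value (or each element
// of the selected array, for "[*]") is replaced by a reference object.
function storeSelected(output: unknown, selector: string, makeReference: MakeReference): unknown {
  const wrap = (payload: unknown) => ({ "@middy-store": makeReference(payload) });
  const each = selector.endsWith("[*]");
  const path = parsePath(each ? selector.slice(0, -3) : selector);

  const replace = (value: unknown): unknown =>
    each && Array.isArray(value) ? value.map((item) => wrap(item)) : wrap(value);

  // Rebuild the output along the path, leaving everything else untouched.
  const walk = (node: unknown, depth: number): unknown => {
    if (depth === path.length) return replace(node);
    const segment = path[depth];
    if (Array.isArray(node) && typeof segment === "number") {
      return node.map((item, i) => (i === segment ? walk(item, depth + 1) : item));
    }
    if (node !== null && typeof node === "object" && !Array.isArray(node) && typeof segment === "string") {
      const record = node as Record<string, unknown>;
      if (segment in record) {
        return { ...record, [segment]: walk(record[segment], depth + 1) };
      }
    }
    throw new Error(`Selector "${selector}" does not match the output`);
  };

  return walk(output, 0);
}

// Fake store that hands out numbered keys, for illustration only.
let counter = 0;
const fakeStore: MakeReference = () => `s3://bucket/key-${counter++}`;

const partial = storeSelected({ a: { b: ["foo", "bar", "baz"] } }, "a.b[0]", fakeStore);
// partial: { a: { b: [{ "@middy-store": "s3://bucket/key-0" }, "bar", "baz"] } }
```

Rebuilding the object along the path instead of mutating it keeps the handler's original output intact, which makes the behaviour easier to reason about in a sketch like this.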
middy-store will calculate the size of the entire output returned from the handler. The size is calculated by stringifying the output, if it's not already a string, and calculating the UTF-8 encoded size of the string in bytes. It will then compare this size to the configured minSize in the storingOptions config. If the output size is equal to or greater than the minSize, it will store the output or a part of it in the Store.
export const handler = middy()
  .use(
    middyStore({
      stores: [new S3Store({ /* S3 options */ })],
      storingOptions: {
        minSize: Sizes.STEP_FUNCTIONS, /* 256KB */
        // minSize: Sizes.LAMBDA_SYNC, /* 6MB */
        // minSize: Sizes.LAMBDA_ASYNC, /* 256KB */
        // minSize: 1024 * 1024, /* 1MB */
        // minSize: Sizes.ZERO, /* 0 */
        // minSize: Sizes.INFINITY, /* Infinity */
        // minSize: Sizes.kb(512), /* 512KB */
        // minSize: Sizes.mb(1), /* 1MB */
      }
    })
  )
  .handler(async () => output);

await handler({});
middy-store provides a Sizes helper with some predefined limits for Lambda and Step Functions. If minSize is not specified, it will use Sizes.STEP_FUNCTIONS with 256KB as the default minimum size. Sizes.ZERO (equal to the number 0) means that middy-store will always store the payload in a Store, ignoring the actual output size. On the other hand, Sizes.INFINITY (equal to Number.POSITIVE_INFINITY) means that it will never store the payload in a Store.
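The size check described above boils down to a few lines. The following is a sketch under the assumption that the comparison is a plain "greater than or equal"; `byteSize` and `shouldStore` are hypothetical helper names, not functions exported by middy-store.

```typescript
import { Buffer } from "node:buffer";

// UTF-8 size of the output in bytes; non-strings are stringified first.
function byteSize(output: unknown): number {
  const text = typeof output === "string" ? output : JSON.stringify(output) ?? "";
  return Buffer.byteLength(text, "utf8");
}

// Store the output when it is at least minSize bytes.
function shouldStore(output: unknown, minSize: number): boolean {
  return byteSize(output) >= minSize;
}

// byteSize("héllo")   -> 6 (the "é" takes two bytes in UTF-8)
// byteSize({ a: 1 })  -> 7 (length of '{"a":1}')
// shouldStore("x", 0)        -> true  (a minimum of 0 always stores)
// shouldStore("x", Infinity) -> false (an infinite minimum never stores)
```

Measuring bytes rather than string length matters for non-ASCII payloads, where a single character can take up to four bytes.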
Currently, there is only one Store implementation for Amazon S3, but I'm planning to implement a Store backed by DynamoDB and DAX. DynamoDB, with its Time-To-Live (TTL) feature, provides a great option for short-term payloads that only need to exist during the execution of a workflow like Step Functions.
The middy-store-s3 package provides a store implementation for Amazon S3. It uses the official @aws-sdk/client-s3 package to interact with S3.
import { randomUUID } from 'node:crypto';
import middy from '@middy/core';
import { middyStore } from 'middy-store';
import { S3Store } from 'middy-store-s3';

const handler = middy()
  .use(
    middyStore({
      stores: [
        new S3Store({
          config: { region: "us-east-1" },
          bucket: "bucket",
          key: () => randomUUID(),
          format: "arn",
        }),
      ],
    }),
  )
  .handler(async (input) => {
    return { /* ... */ };
  });
The S3Store only requires a bucket where the payloads are being stored. The key is optional and defaults to randomUUID(). The format configures the style of the reference that is returned after a payload is stored. The supported formats include arn, object, or one of the URL formats from the amazon-s3-url package. It's important to note that S3Store can load any of these formats; the format config only concerns the returned reference. The config is the S3 client configuration and is optional. If not set, the S3 client will resolve the config (credentials, region, etc.) from the environment or file system.
A new Store can be implemented as a class or a plain object, as long as it provides the required functions from the StoreInterface interface.
Here's an example of a Store to store and load payloads as base64 encoded data URLs:
import middy from '@middy/core';
import { StoreInterface, Sizes, middyStore } from 'middy-store';

const base64Store: StoreInterface<string, string> = {
  name: "base64",
  /* Reference must be a string starting with "data:text/plain;base64," */
  canLoad: ({ reference }) => {
    return (
      typeof reference === "string" &&
      reference.startsWith("data:text/plain;base64,")
    );
  },
  /* Decode base64 string and parse into object */
  load: async ({ reference }) => {
    const base64 = reference.replace("data:text/plain;base64,", "");
    /* store() always stringifies, so parse to restore the original payload */
    return JSON.parse(Buffer.from(base64, "base64").toString());
  },
  /* Payload must be a string or an object */
  canStore: ({ payload }) => {
    return typeof payload === "string" || typeof payload === "object";
  },
  /* Stringify object and encode as base64 string */
  store: async ({ payload }) => {
    const base64 = Buffer.from(JSON.stringify(payload)).toString("base64");
    return `data:text/plain;base64,${base64}`;
  },
};

const handler = middy()
  .use(
    middyStore({
      stores: [base64Store],
      storingOptions: {
        minSize: Sizes.ZERO, /* Always store the data */
      }
    }),
  )
  .handler(async (input) => {
    /* Random text with 100 words */
    return `Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.`;
  });

/* context is the Lambda context object, e.g. a stub when running locally */
const output = await handler(null, context);

/* Prints: { '@middy-store': 'data:text/plain;base64,IkxvcmVtIGlwc3VtIGRvbG9yIHNpdC...' } */
console.log(output);
This example is the perfect way to try middy-store, because it doesn't rely on external resources like an S3 bucket. You will find it in the repository at examples/custom-store and should be able to run it locally.
I've been tinkering with the API design for a while, and it's definitely not stable yet. I would love to get feedback on the current state as well as suggestions for changes or improvements. If you are eager to contribute to this project, please go ahead and submit feature requests or pull requests.
You will need @middy/core >= v5 to use middy-store. Please be aware that the API is not stable yet and might change in the future. To avoid accidental breaking changes, please pin the version of middy-store and its sub-packages in your package.json to an exact version.
npm install --save-exact @middy/core middy-store middy-store-s3