How to use PHP and Alibaba Cloud OCR to extract table text?
Alibaba Cloud OCR (Optical Character Recognition) is a powerful text recognition technology that can be used to extract text information from pictures or scanned documents. As a popular server-side scripting language, PHP can interact with the Alibaba Cloud OCR API to implement table text extraction functions. This article will introduce in detail how to use PHP and Alibaba Cloud OCR to implement this function, and provide code examples.
First, you need to register an account on the Alibaba Cloud official website and activate the OCR service. Then, log in to the Alibaba Cloud console and obtain the Access Key ID and Access Key Secret on the OCR service page. This information will be used for subsequent API requests.
Alibaba Cloud provides an official PHP SDK, which you can install through Composer. Run the following command on the command line:
composer require alibabacloud/sdk
Create a PHP file named "extract_table.php" and load the Alibaba Cloud OCR SDK at the beginning of the file:
<?php
require 'vendor/autoload.php';

use AlibabaCloud\Client\AlibabaCloud;
use AlibabaCloud\Client\Exception\ClientException;
use AlibabaCloud\Client\Exception\ServerException;
Add the following code in the file to connect to Alibaba Cloud OCR API and authenticate:
AlibabaCloud::accessKeyClient('your_access_key_id', 'your_access_key_secret')
    ->regionId('your_region_id') // for example: cn-shanghai
    ->asDefaultClient();
Replace "your_access_key_id" and "your_access_key_secret" with the Access Key ID and Access Key Secret you obtained from the Alibaba Cloud console, and replace "your_region_id" with the ID of the region where your OCR service is activated (for example: cn-shanghai).
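Hardcoding the Access Key pair in the script makes it easy to leak through version control. As a minimal sketch, the values could instead be read from environment variables. The helper name loadOcrCredentials and the three variable names below are assumptions chosen for illustration, not something the SDK requires.

```php
<?php
// Hypothetical helper: pick the Alibaba Cloud credentials out of an array of
// environment variables instead of hardcoding them in the script.
// The variable names are assumptions made for this sketch.
function loadOcrCredentials(array $env): array
{
    $required = [
        'ALIBABA_CLOUD_ACCESS_KEY_ID',
        'ALIBABA_CLOUD_ACCESS_KEY_SECRET',
    ];

    // Fail early with a clear message if a required value is missing or empty.
    foreach ($required as $name) {
        if (empty($env[$name])) {
            throw new RuntimeException("Missing environment variable: $name");
        }
    }

    return [
        'accessKeyId'     => $env['ALIBABA_CLOUD_ACCESS_KEY_ID'],
        'accessKeySecret' => $env['ALIBABA_CLOUD_ACCESS_KEY_SECRET'],
        // Fall back to the region used as the example in this article.
        'regionId'        => $env['ALIBABA_CLOUD_REGION_ID'] ?? 'cn-shanghai',
    ];
}
```

Calling loadOcrCredentials(getenv()) returns an array whose values can be passed to AlibabaCloud::accessKeyClient() and regionId() in place of the placeholder strings.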
Add the following code in the file to implement the table text extraction function:
try {
    $response = AlibabaCloud::ocr()
        ->v20191230()
        ->recognizeTable()
        ->withImageUrl('your_image_url')
        ->debug(true)       // optional: enable debug mode to help locate problems
        ->timeout(3)        // optional: request timeout, in seconds
        ->connectTimeout(3) // optional: connection timeout, in seconds
        ->request();

    // Parse the result returned by the API
    $result = json_decode($response->getBody(), true);
    $tables = $result['Data']['Tables'];

    // Print the extracted text, one cell per line
    foreach ($tables as $table) {
        foreach ($table['Result']['TableCells'] as $cell) {
            echo $cell['Text'], PHP_EOL;
        }
    }
} catch (ClientException $e) {
    // Handle client-side errors
    echo $e->getErrorMessage(), PHP_EOL;
} catch (ServerException $e) {
    // Handle server-side errors
    echo $e->getErrorMessage(), PHP_EOL;
}
Replace "your_image_url" with the URL of the image you want to extract table text from.
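The parsing step in the script can be isolated into a small function that tolerates missing keys, so an unexpected response does not raise warnings. This is a sketch under assumptions: the key names (Data, Tables, Result, TableCells, Text) simply mirror the script in this article and should be checked against the response your account actually returns, and the sample payload is made up for illustration.

```php
<?php
// Collect the text of every cell from a decoded response array.
// The key names mirror the script in this article and are assumptions
// about the response shape, not a confirmed API contract.
function collectCellTexts(array $result): array
{
    $texts = [];

    // The ?? operator yields an empty array when a key is missing,
    // so incomplete responses are skipped instead of triggering warnings.
    foreach ($result['Data']['Tables'] ?? [] as $table) {
        foreach ($table['Result']['TableCells'] ?? [] as $cell) {
            if (isset($cell['Text']) && $cell['Text'] !== '') {
                $texts[] = $cell['Text'];
            }
        }
    }

    return $texts;
}

// Hypothetical sample payload, not real API output.
$sample = [
    'Data' => [
        'Tables' => [
            ['Result' => ['TableCells' => [['Text' => 'Name'], ['Text' => 'Price']]]],
            ['Result' => []], // a table without cells contributes nothing
        ],
    ],
];

// Print one cell per line.
echo implode(PHP_EOL, collectCellTexts($sample)), PHP_EOL;
```

In the real script, the array produced by json_decode() would be passed in place of $sample.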
Save and close the "extract_table.php" file, and then execute the following command on the command line to run the PHP file:
php extract_table.php
PHP then sends a request to the Alibaba Cloud OCR API, which extracts the text in the table, and the script prints the result to the command line window.
After the above steps, you can use PHP and Alibaba Cloud OCR API to implement the table text extraction function. Depending on your actual needs, you can save the extracted text to a file or use it for subsequent data processing. Hope this article is helpful to you!
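Saving the extracted text to a file could look like the following sketch, which writes rows of cell text to a CSV file with PHP's built-in fputcsv(). The function name saveRowsAsCsv, the sample rows, and the output path are hypothetical, and the sketch assumes the cell text has already been grouped into rows.

```php
<?php
// Write rows of cell text to a CSV file and return the number of rows written.
function saveRowsAsCsv(array $rows, string $path): int
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }

    $written = 0;
    foreach ($rows as $row) {
        // Separator, enclosure, and escape character are passed explicitly
        // so the call behaves the same across PHP versions.
        fputcsv($handle, $row, ',', '"', '\\');
        $written++;
    }

    fclose($handle);
    return $written;
}

// Hypothetical rows standing in for text extracted from a table.
$rows = [
    ['Name', 'Price'],
    ['Widget', '9.99'],
];
$path = sys_get_temp_dir() . '/extracted_table.csv';

$count = saveRowsAsCsv($rows, $path);
echo "Wrote $count rows to $path", PHP_EOL;
```

CSV is a convenient choice here because spreadsheet tools and most data-processing libraries can read it directly.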