Calling the Alibaba Cloud API from Python to Extract Text with OCR
Alibaba Cloud provides a range of APIs, including an OCR (Optical Character Recognition) interface. This interface recognizes text in images, which suits text extraction tasks such as converting paper documents into electronic text.
This article shows how to call Alibaba Cloud's OCR interface from Python to extract text. The steps are as follows:
Step 1: Install Alibaba Cloud SDK
To call Alibaba Cloud's API, you first need to install the corresponding SDK. In Python, the Alibaba Cloud SDK can be installed with pip.
Open the terminal and enter the following command:
pip install aliyun-python-sdk-core
pip install aliyun-python-sdk-ocr
Step 2: Obtain Access Key and Secret Key
To call Alibaba Cloud's API, you need to provide an Access Key and a Secret Key. You can create and obtain both keys in the Alibaba Cloud console. Keep both of them in a safe place.
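One common way to keep the keys out of source code is to read them from environment variables. The following is a minimal sketch of that idea. The variable names ALIYUN_ACCESS_KEY and ALIYUN_SECRET_KEY are hypothetical names chosen for this example, not something defined by Alibaba Cloud or its SDK.

```python
import os


def load_credentials(env=os.environ):
    """Return (access_key, secret_key) read from environment variables.

    The variable names are illustrative assumptions, not an SDK convention.
    """
    try:
        return env["ALIYUN_ACCESS_KEY"], env["ALIYUN_SECRET_KEY"]
    except KeyError as missing:
        # Fail early with a readable message instead of sending empty keys.
        raise RuntimeError(f"Missing environment variable: {missing}") from None
```

The returned pair can then be passed to the client constructor in place of hard-coded strings.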
Step 3: Write code to call the OCR interface
First you need to import the relevant libraries:
import base64
import json
import urllib
import urllib.request
from aliyunsdkcore import client
from aliyunsdkocr.request.v20191230 import RecognizeCharacterRequest
Next, initialize the Alibaba Cloud client:
def create_aliyun_client():
    access_key = "<Your Access Key>"
    secret_key = "<Your Secret Key>"
    region_id = "cn-hangzhou"
    return client.AcsClient(access_key, secret_key, region_id)
Then, write a function that calls the OCR interface:
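def ocr_character(image_path):
    # App Key (not referenced by the request below)
    app_key = "<Your App Key>"
    request = RecognizeCharacterRequest.RecognizeCharacterRequest()
    request.set_accept_format('json')
    # Read the local image and Base64-encode it
    with open(image_path, 'rb') as file:
        image_data = file.read()
    # b64encode returns bytes; decode to str before passing it as a parameter
    base64_data = base64.b64encode(image_data).decode('ascii')
    # The parameter is named ImageURL; if the service rejects Base64 data,
    # pass the URL of an uploaded image here instead
    request.set_ImageURL(base64_data)
    response = create_aliyun_client().do_action_with_exception(request)
    result = json.loads(response)
    print(result)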
In the code above, replace the Access Key, Secret Key and App Key with your own values, and pass in the path of the image you want to recognize.
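The function above prints the whole parsed response. To keep only the recognized strings, the result can be filtered with a small helper. The sketch below assumes the response nests its entries under "Data" and "Results", each carrying a "Text" field; that layout is an assumption made for illustration and should be checked against the response you actually receive. The helper name extract_text is also hypothetical.

```python
def extract_text(result):
    """Collect recognized strings from a parsed OCR response dict."""
    # Assumed layout: {"Data": {"Results": [{"Text": "..."}, ...]}}
    results = result.get("Data", {}).get("Results", [])
    # Skip entries without a "Text" field rather than raising KeyError.
    return [item["Text"] for item in results if "Text" in item]


# Hand-written sample in the assumed layout, not real service output.
sample = {"Data": {"Results": [{"Text": "Hello"}, {"Text": "World"}]}}
print("\n".join(extract_text(sample)))
```

Using .get with defaults means an unexpected or empty response yields an empty list instead of an exception.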
Finally, call the ocr_character function and pass in the path of the image that needs to be recognized.
if __name__ == "__main__":
    image_path = "<Your Image Path>"
    ocr_character(image_path)
Note that a local image path is used here. To recognize an image hosted online, use its URL instead. In addition, Alibaba Cloud's OCR interface currently supports only a limited set of image formats. In general, JPEG or PNG images are recommended.
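Since JPEG and PNG are the recommended formats, it can help to check a file before sending it. The following sketch inspects the leading bytes of the image data, which are fixed signatures for these two formats. The function name detect_image_format is a hypothetical helper for this example and is not part of the Alibaba Cloud SDK.

```python
JPEG_MAGIC = b"\xff\xd8\xff"        # first bytes of every JPEG file
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # 8-byte PNG signature


def detect_image_format(data):
    """Return "jpeg", "png", or None based on the file's leading bytes."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None
```

Checking the content rather than the file extension also catches files that were renamed without being converted.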
Summary:
This article introduces how to use Python to call Alibaba Cloud's OCR interface to implement the text extraction function. Through this interface, we can easily convert the text in the picture into electronic text, which improves work efficiency and simplifies some manual transcription work.
Hope this article is helpful to you!