OpenCV: Finding the columns in an Arabic journal (Python)


Feb 22, 2024, 12:52 PM


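Question

I am new to OpenCV and new to Python. I am trying to stitch together code I found online to solve a research problem. I have an Arabic journal from 1870 with hundreds of pages; each page contains two columns and has a thick black border. I want to extract the two columns as image files so that I can run OCR on them separately, while ignoring the header and footer. Here is an example page:

[Image: page 3]

I have ten pages of the original print as separate PNG files, and I wrote the following script to process each one. It works as expected on 2 of the 10 pages but fails to produce columns on the other 8. I do not understand all the functions well enough to know where I could tweak the values, or whether my whole approach is misguided. I figured the best way to learn is to ask the community how you would solve this.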

import cv2

def cutpage(fname, pnum):
    image = cv2.imread(fname)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (7,7), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 13))
    dilate = cv2.dilate(thresh, kernel, iterations=1)
    dilatename = "temp/dilate" + str(pnum) + ".png"
    cv2.imwrite(dilatename, dilate)
    cnts = cv2.findContours(dilate, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[0])

    fullpage=1
    column=1
    for c in cnts:
        x, y, w, h = cv2.boundingRect(c)
        if h > 300 and w > 20:
            if (h/w)<2.5:
                print("Found full page: ", x, y, w, h)
                filename = "temp/p" + str(pnum) + "-full" + str(fullpage) + ".png"
                fullpage+=1
            else:
                print("Found column: ", x, y, w, h)
                filename = "temp/p" + str(pnum) + "-col" + str(column) + ".png"
                column+=1
            roi = image[y:y+h, x:x+w]
            cv2.imwrite(filename, roi)
    return (column-1)
        
for nr in range(10):
    filename = "p"+str(nr)+".png"
    print("Checking page", nr)
    diditwork = cutpage(filename, nr)
    print("Found", diditwork, "columns")


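Following a tutorial, I create a blurred and dilated inverted binary image so that the different rectangular regions show up as large white areas. I also save a copy of each dilated version so that I can see what it looks like; here is the page above after processing:

[Image: page 3, dilated]

The "for c in cnts" loop should find the large rectangular regions in the image. If the height-to-width ratio is less than 2.5, I get the full page (without header and footer, which works well); if the ratio is greater than that, I know it is a column, and it is saved as, for example, temp/p2-col2.png.

So I get some nice full pages without header and footer, that is, cropped to the large black border but not cut into columns. On 2 of the 10 pages I get what I want, namely:

[Image: successful columns from page 2]

Since I sometimes get the desired result, something must be working, but I do not know how to improve it further.

Edit:

Here are more example pages: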

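Accepted answer

I tried something without any dilation, because I wanted to see whether the middle line alone could be used as a "separator". Here is the code: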

import cv2
import numpy as np

im = cv2.cvtColor(cv2.imread("arabic.png"), cv2.COLOR_BGR2RGB) # read im as RGB for better plots
gray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY) # convert to gray
_, threshold = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV) # inverse thresholding
contours, _ = cv2.findContours(threshold, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE) # find contours (two return values, as in OpenCV 4.x)
sortedcontours = sorted(contours, key=cv2.contourArea, reverse=True) # sort according to area, descending
bigbox = sortedcontours[0] # get the contour of the big box
middleline = sortedcontours[1] # get the contour of the vertical line
xmiddleline, _, _, _ = cv2.boundingRect(middleline) # get x coordinate of middleline
leftboxcontour = np.array([point for point in bigbox if point[0, 0] < xmiddleline]) # points of the big contour left of the line
rightboxcontour = np.array([point for point in bigbox if point[0, 0] >= xmiddleline]) # points of the big contour right of the line
leftboxx, leftboxy, leftboxw, leftboxh = cv2.boundingRect(leftboxcontour) # get properties of box on left
rightboxx, rightboxy, rightboxw, rightboxh = cv2.boundingRect(rightboxcontour) # get properties of box on right
leftboxcrop = im[leftboxy:leftboxy + leftboxh, leftboxx:leftboxx + leftboxw] # crop left
rightboxcrop = im[rightboxy:rightboxy + rightboxh, rightboxx:rightboxx + rightboxw] # crop right
# maybe add assertions about the aspect ratio here?
cv2.imwrite("right.png", cv2.cvtColor(rightboxcrop, cv2.COLOR_RGB2BGR)) # save image (imwrite expects BGR)
cv2.imwrite("left.png", cv2.cvtColor(leftboxcrop, cv2.COLOR_RGB2BGR)) # save image (imwrite expects BGR)


I did not use any assertions about the aspect ratio, so that may still be something you need to add.
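One way such an aspect-ratio check could look is sketched below. The function name `classify_region` is hypothetical, and the thresholds (height above 300, width above 20, ratio 2.5) are simply borrowed from the script in the question; they are assumptions about these scans, not tuned values.

```python
def classify_region(w, h, min_w=20, min_h=300, column_ratio=2.5):
    """Label a bounding box as 'column', 'page' or 'noise'.

    Thresholds are taken from the question's script and are assumptions,
    not values validated against the actual scans.
    """
    if w <= min_w or h <= min_h:
        return "noise"      # too small to be a page or a column
    if h / w >= column_ratio:
        return "column"     # tall and narrow
    return "page"           # large but not tall enough to be one column
```

After cropping, a line such as `assert classify_region(leftboxw, leftboxh) == "column"` would then flag pages where the split went wrong.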

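Basically, the most important lines in this approach are the ones that build the left and right contours from the x coordinate. This is the final result I got: [Image: resulting left and right crops]

There are still some black parts at the edges, but that should not be a problem for OCR.

FYI: I am using the following packages in Jupyter: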
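The x-coordinate split can also be tried in isolation on synthetic data, without an image. The sketch below assumes an OpenCV-style contour array of shape (N, 1, 2) and uses a NumPy boolean mask in place of the list comprehension; the helper names `split_contour_at_x` and `bounding_box` are hypothetical, and the `+ 1` in the box size is meant to match the inclusive convention of `cv2.boundingRect`.

```python
import numpy as np

def split_contour_at_x(contour, x_split):
    """Split a contour of shape (N, 1, 2) into points left and right of x_split."""
    xs = contour[:, 0, 0]
    return contour[xs < x_split], contour[xs >= x_split]

def bounding_box(points):
    """Return (x, y, w, h) of the axis-aligned box around the points (inclusive)."""
    xs, ys = points[:, 0, 0], points[:, 0, 1]
    x, y = int(xs.min()), int(ys.min())
    return x, y, int(xs.max()) - x + 1, int(ys.max()) - y + 1

# Synthetic outline of a 100 x 50 rectangle standing in for the big box.
outline = np.array(
    [[[x, y]] for x in range(100) for y in (0, 49)]
    + [[[x, y]] for y in range(50) for x in (0, 99)],
    dtype=np.int32,
)
left, right = split_contour_at_x(outline, 50)  # 50 plays the role of xmiddleline
```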

import cv2
import numpy as np
%matplotlib notebook
import matplotlib.pyplot as plt

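v2.0: using only the big-box detection:

So I did some dilation, after which the big box is easy to detect. I used a horizontal kernel to make sure the vertical lines of the big box are always thick enough to be detected. However, I could not solve the problem of the middle line, because it is very thin... Nevertheless, here is the code for the approach above: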

import cv2
import numpy as np
import matplotlib.pyplot as plt

im = cv2.cvtColor(cv2.imread("1.png"), cv2.COLOR_BGR2RGB) # read im as RGB for better plots
gray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY) # convert to gray
gray[gray<255] = 0 # add contrast so every pixel is either completely black or white
_, threshold = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV) # inverse thresholding
thresholddilated = cv2.dilate(threshold, np.ones((1,10), np.uint8), iterations = 1) # dilate horizontally
contours, _ = cv2.findContours(thresholddilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE) # find contours
sortedcontours = sorted(contours, key = cv2.contourArea, reverse=True) # sort according to area, descending
x, y, w, h = cv2.boundingRect(sortedcontours[0]) # get the bounding rect properties of the contour
left = im[y:y+h, x:x+int(w/2)+10].copy() # left half, plus 10 px from the right just in case
right = im[y:y+h, x+int(w/2)-10:x+w].copy() # right half, plus 10 px from the left; offset by x like the left half
fig, ax = plt.subplots(nrows = 2, ncols = 3) # plotting...
ax[0,0].axis("off")
ax[0,1].imshow(im)
ax[0,1].axis("off")
ax[0,2].axis("off")
ax[1,0].imshow(left)
ax[1,0].axis("off")
ax[1,1].axis("off")
ax[1,2].imshow(right)
ax[1,2].axis("off")


These are the results. You will notice they are not perfect but, again, since your goal is OCR, that should not be a problem.
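The halving step of v2.0 can be isolated into a small helper, which makes it easier to check that both halves are measured from the box's own x offset. This is only a sketch: `split_box` is a hypothetical name, and the default overlap of 10 pixels mirrors the margin used in the code above.

```python
def split_box(x, y, w, h, overlap=10):
    """Return index tuples (rows, cols) for the left and right half of a box.

    The box starts at (x, y) and is w wide and h tall. Each half extends
    `overlap` pixels past the midpoint, clamped to the box itself.
    """
    mid = x + w // 2
    rows = slice(y, y + h)
    left = (rows, slice(x, min(mid + overlap, x + w)))
    right = (rows, slice(max(mid - overlap, x), x + w))
    return left, right
```

With a NumPy image `im` and a box from `cv2.boundingRect`, the crops would be `im[left_idx]` and `im[right_idx]`, where `left_idx, right_idx = split_box(x, y, w, h)`.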

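Please let me know whether this works for you; if not, I will rack my brain for a better solution...

v3.0: a better way to get a straighter image, which should improve OCR quality.

Inspired by another answer of mine (linked as "answer" in the original post), it makes sense to straighten the image so that OCR gives better results. I therefore applied a four-point transform to the detected outer box. This straightens the image slightly and makes the text more horizontal. Here is the code: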
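For the thin middle line that v2.0 leaves unsolved, one alternative worth trying is a vertical projection profile: count the ink pixels in every image column and pick the densest column near the horizontal centre. This is not what the answer does; it is a swapped-in technique, sketched here under the assumption that the divider runs most of the page height and sits within the central 20% of the cropped box. The name `find_divider_column` and the `search_fraction` parameter are hypothetical.

```python
import numpy as np

def find_divider_column(binary, search_fraction=0.2):
    """Return the x index of the most ink-filled column near the centre.

    `binary` is a 2-D array in which ink pixels are non-zero (for example
    the inverse-thresholded image cropped to the big box).
    """
    height, width = binary.shape
    profile = (binary > 0).sum(axis=0)            # ink count per column
    half_window = max(1, int(width * search_fraction / 2))
    centre = width // 2
    start = max(0, centre - half_window)
    stop = min(width, centre + half_window + 1)
    return start + int(np.argmax(profile[start:stop]))

# Synthetic page: two text blocks and a thin full-height rule at x = 103.
page = np.zeros((100, 200), dtype=np.uint8)
page[10:20, 5:95] = 255
page[10:20, 110:195] = 255
page[:, 103] = 255
divider_x = find_divider_column(page)
```

On real scans the rule may be broken or slightly slanted, so the profile peak is best treated as a candidate split position to verify by eye on a few pages.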

import cv2
import numpy as np
import matplotlib.pyplot as plt
from imutils.perspective import four_point_transform

im = cv2.cvtColor(cv2.imread("2.png"), cv2.COLOR_BGR2RGB) # read im as RGB for better plots
gray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY) # convert to gray
gray[gray<255] = 0 # add contrast so every pixel is either completely black or white
_, threshold = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV) # inverse thresholding
thresholddilated = cv2.dilate(threshold, np.ones((1,10), np.uint8), iterations = 1) # dilate horizontally
contours, _ = cv2.findContours(thresholddilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE) # find contours
largest_contour = max(contours, key = cv2.contourArea) # get largest contour
hull = cv2.convexHull(largest_contour) # get the hull
epsilon = 0.02 * cv2.arcLength(largest_contour, True) # epsilon
pts1 = np.float32(cv2.approxPolyDP(hull, epsilon, True).reshape(-1, 2)) # corner points; assumes the hull simplifies to exactly four
result = four_point_transform(im, pts1) # using imutils
height, width = result.shape[:2] # get the dimensions of the transformed image
left = result[:, 0:int(width/2)].copy() # from the beginning to half the width
right = result[:, int(width/2): width].copy() # from half the width till the end
fig, ax = plt.subplots(nrows = 2, ncols = 3) # plotting...
ax[0,0].axis("off")
ax[0,1].imshow(result)
ax[0,1].axvline(width/2)
ax[0,1].axis("off")
ax[0,2].axis("off")
ax[1,0].imshow(left)
ax[1,0].axis("off")
ax[1,1].axis("off")
ax[1,2].imshow(right)
ax[1,2].axis("off")

With the following packages:

import cv2
import numpy as np
%matplotlib notebook
import matplotlib.pyplot as plt
from imutils.perspective import four_point_transform

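As you can see from the code, this is a better approach: thanks to the four-point transform, you can force the image to be centred and horizontal. There is also no need to include any overlap, because the image splits cleanly. Here is an example for reference: [Image: example result]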
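To make the four-point transform less of a black box, the sketch below shows two preparatory steps such a transform typically needs: putting the four corners into a consistent order and deriving the size of the straightened output. It is an illustration of the idea only, written in plain Python with hypothetical names (`order_corners`, `target_size`); the actual `imutils` implementation may differ in detail.

```python
from math import dist

def order_corners(points):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    Uses the common heuristic that x + y is smallest at the top-left and
    largest at the bottom-right, while x - y is largest at the top-right
    and smallest at the bottom-left. Assumes a mildly skewed rectangle.
    """
    if len(points) != 4:
        raise ValueError("expected exactly four corner points")
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = max(points, key=lambda p: p[0] - p[1])
    bottom_left = min(points, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]

def target_size(corners):
    """Width and height of the straightened image, from the longest edges."""
    tl, tr, br, bl = corners
    width = max(dist(br, bl), dist(tr, tl))
    height = max(dist(tr, br), dist(tl, bl))
    return int(width), int(height)
```

The `ValueError` also covers the case where `approxPolyDP` returns more or fewer than four points, which is worth checking before transforming a whole batch of pages.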
