[Unity] Document Recognition with Unity + OpenCV


Dean83 2022. 4. 1. 14:43

Scan documents with the PaperScanner script.

Example scene: DocumentScannerScene

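1. Quick summary

- Create a PaperScanner member variable.

- Set the scan options.

- Convert the texture to a Mat and assign it to the PaperScanner's Mat input (scanner.Input).

- Reading the PaperScanner's Success property runs the internal function CalculateOutput, which performs the scan:
- grayscale conversion, noise reduction, blurring, edge detection, contour detection, and so on
- the document region is extracted from the detected regions and returned as the result

- On Success, convert the result back to a texture and display it on screen.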
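
The lazy evaluation in the summary (reading Success triggers CalculateOutput, and the result is cached until the input changes) can be sketched in plain C#. This is a toy stand-in rather than the asset's code: LazyScannerSketch, CalculateCalls and the int[] input are hypothetical names chosen for illustration; only Input, Success, CalculateOutput and the dirty_ flag appear in the article.

```csharp
using System;

// Hypothetical stand-in for PaperScanner: the member names mirror the article,
// but the body is a toy so that the lazy-evaluation pattern runs anywhere.
public class LazyScannerSketch
{
    private int[] input_;
    private bool dirty_ = true;
    private bool success_;

    // Counts how often the (expensive) pipeline actually ran.
    public int CalculateCalls { get; private set; }

    public int[] Input
    {
        get { return input_; }
        set { input_ = value; dirty_ = true; }   // new input invalidates the cached result
    }

    public bool Success
    {
        get
        {
            if (dirty_) CalculateOutput();       // the work happens on the first read only
            return success_;
        }
    }

    private void CalculateOutput()
    {
        CalculateCalls++;
        success_ = input_ != null && input_.Length > 0;  // placeholder for the real pipeline
        dirty_ = false;
    }
}
```

Reading Success twice in a row runs the pipeline once; assigning a new Input makes the next read run it again.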

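2. Details - option values

- PaperScanner options

- GrayMode : Grayscale or HueGrayscale. Grayscale converts RGB to gray; HueGrayscale converts to HSV and uses the hue channel as the gray image.

- ColorThreshold : the algorithm used to find a suitable threshold when converting to black and white. The default is Adaptive.

- Adaptive : splits the image into small regions and finds a separate threshold for each region.
- It comes in MeanC and GaussianC variants, but the document scanner only uses MeanC.

- Otsu : evaluates every candidate threshold, so it is slower, and it gives poor results on noisy images.
It is therefore best to reduce noise with a Gaussian filter before applying it.
- How the Otsu algorithm works :

- Pick a candidate threshold, split the pixels into a dark class and a bright class, and measure how tightly each class is distributed. Repeat for every candidate threshold, then choose the threshold whose two classes are the most uniform (smallest within-class variance).
The candidates are simply the gray levels of the image histogram, so no initial guess is needed.

- Decolorization : a scanned document reads best in black and white. This option decides whether the output is converted to black and white.

- Always : always convert

- Never : never convert

- Automatic : the algorithm decides.

- ExpectedArea : the fraction of the whole picture that the document is expected to occupy. The default is 33%.

- Scale : preprocessing scale. The default is 512; 0 turns it off.

- NoiseReduction : a value between 0 and 1 that controls noise reduction. The default is 0.33.

- EdgesTight : a value between 0 and 1; the default is 0.75. It is used to derive the lower and upper thresholds of the Canny edge detection algorithm. A lower value narrows the edge detection range and a higher value widens it.
The median of the grayscale image's histogram is computed, and that median is multiplied by the EdgesTight setting to obtain the lower and upper values for Canny.
Canny is one of the most widely used edge detection algorithms.

- DropBadGuess : true/false, default true. Decides whether a shape guessed from insufficient information is discarded, that is, whether a result that differs too much from the expected size is dropped.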
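
To make the Otsu description concrete, here is a minimal sketch, assuming an 8-bit grayscale image given as a flat byte array. OtsuSketch and FindThreshold are hypothetical names; in the asset the work is done by OpenCV's Threshold call with the Otsu flag. The sketch maximizes the between-class variance, which is equivalent to minimizing the within-class variance.

```csharp
using System;

public static class OtsuSketch
{
    // Returns the gray level t that best separates the pixels into
    // "value <= t" and "value > t" (largest between-class variance).
    public static int FindThreshold(byte[] pixels)
    {
        var hist = new long[256];
        foreach (byte p in pixels) hist[p]++;

        long total = pixels.Length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += i * (double)hist[i];

        long countBelow = 0;
        double sumBelow = 0, bestVariance = -1;
        int best = 0;
        for (int t = 0; t < 256; t++)
        {
            countBelow += hist[t];
            sumBelow += t * (double)hist[t];
            if (countBelow == 0) continue;        // dark class still empty
            long countAbove = total - countBelow;
            if (countAbove == 0) break;           // bright class empty from here on

            double meanBelow = sumBelow / countBelow;
            double meanAbove = (sumAll - sumBelow) / countAbove;
            double diff = meanBelow - meanAbove;
            double between = (double)countBelow * countAbove * diff * diff;
            if (between > bestVariance)
            {
                bestVariance = between;
                best = t;
            }
        }
        return best;
    }
}
```

For an image with two clearly separated gray levels, any returned threshold lies between the two levels, so a Binary threshold at that value splits them cleanly.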

3. Details - the scan process

- The options below were set; everything else was left at its default.

- NoiseReduction : 0.7

- EdgesTight : 0.9

- ExpectedArea : 0.2

- GrayMode = Grayscale
- Convert the texture to a Mat; checking Success starts the processing at the same time.

scanner.Input = Unity.TextureToMat(inputTexture);

// On failure, switch to HueGrayscale and retry.
if (!scanner.Success)
    scanner.Settings.GrayMode = PaperScanner.ScannerSettings.ColorMode.HueGrayscale;

- Processing function (CalculateOutput) - convert the image to grayscale (strictly speaking, this removes the color rather than producing a pure black-and-white image)

if (Settings.GrayMode == ScannerSettings.ColorMode.HueGrayscale)
{
    var matHSV = matInput_.CvtColor(ColorConversionCodes.RGB2HSV);
    Mat[] hsvChannels = matHSV.Split();
    matGray = hsvChannels[0];
}
else
{
    matGray = matInput_.CvtColor(ColorConversionCodes.BGR2GRAY);
}

- Processing function (CalculateOutput) - image rescaling

- If the image is larger than Settings.Scale, it is scaled down.

float sx = 1, sy = 1;
if (Settings.Scale != 0)
{
    if (matGray.Width > Settings.Scale)
        sx = (float)Settings.Scale / matGray.Width;
    if (matGray.Height > Settings.Scale)
        sy = (float)Settings.Scale / matGray.Height;

    matScaled = matGray.Resize(new Size(Math.Min(matGray.Width, Settings.Scale), Math.Min(matGray.Height, Settings.Scale)));
}
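
A worked example of the arithmetic in the excerpt above, on plain integers. ScaleSketch and Compute are hypothetical names; the formulas are copied from the excerpt. Note that width and height are clamped independently, so a 1024 x 768 input with Scale = 512 becomes 512 x 512 and the aspect ratio changes, which appears to be why sx and sy are kept separately for mapping the contour back later.

```csharp
using System;

public static class ScaleSketch
{
    // Mirrors the scale computation of the excerpt without any OpenCV types.
    public static void Compute(int width, int height, int scale,
        out float sx, out float sy, out int outWidth, out int outHeight)
    {
        sx = 1; sy = 1;
        outWidth = width; outHeight = height;
        if (scale == 0) return;                       // 0 disables rescaling

        if (width > scale) sx = (float)scale / width;
        if (height > scale) sy = (float)scale / height;

        outWidth = Math.Min(width, scale);
        outHeight = Math.Min(height, scale);
    }
}
```

With (1024, 768, 512) the factors are sx = 0.5 and sy = 512 / 768; dividing a contour point by these factors maps it back to input image coordinates.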

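- Processing function (CalculateOutput) - noise reduction (Median Blur)

- Median Blur is used to reduce noise. It works well on speckle noise such as white dots and black dots in a photo.
- Median Blur replaces each pixel with the median of the pixel values under the kernel.
- In other words, it gathers the pixels surrounding the target pixel, as many as the kernel size covers, and picks their median.
- A kernel is the same thing as a filter: the source pixels pass through it and a computed result comes out (convolution).
- The median filter is a special case: instead of a weighted sum, it substitutes the median of the window directly.
- The larger the kernel (the larger the matrix), the stronger the blur; the smaller the kernel, the weaker the blur.
- Because the output is the median of the window, a large window lets pixels far from the target pixel influence the median, so the value can change a lot; with a small window the change stays small.
- Kernel sizes for blur operations are generally odd, so that the window has a center pixel.
- Honestly, I do not understand the kernel size calculation in the code below. Why is it computed this way...?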

if (Settings.NoiseReduction != 0)
{
    int medianKernel = 11;

    // calculate kernel scale
    double kernelScale = Settings.NoiseReduction;
    if (0 == Settings.Scale)
        kernelScale *= Math.Max(matInput_.Width, matInput_.Height) / 512.0;

    // apply scale
    medianKernel = (int)(medianKernel * kernelScale + 0.5);
    medianKernel = medianKernel - (medianKernel % 2) + 1;

    if (medianKernel > 1)
        matBlur = matScaled.MedianBlur(medianKernel);
}
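
One plausible reading of the kernel size code above, offered as an interpretation rather than the asset author's stated intent: 11 is the base kernel for a 512-pixel working image, NoiseReduction scales it down, the image size scales it up when rescaling is disabled, and the last line forces the result to be odd. The sketch below repeats that arithmetic and adds a tiny 1D median filter; MedianKernelSketch, KernelSize and Median1D are hypothetical names.

```csharp
using System;

public static class MedianKernelSketch
{
    // Same arithmetic as the excerpt: base kernel 11, scaled, then forced odd.
    public static int KernelSize(double noiseReduction, int scale, int inputWidth, int inputHeight)
    {
        double kernelScale = noiseReduction;
        if (scale == 0)   // no downscaling: grow the kernel with the image instead
            kernelScale *= Math.Max(inputWidth, inputHeight) / 512.0;

        int k = (int)(11 * kernelScale + 0.5);   // round to nearest integer
        return k - (k % 2) + 1;                  // even k becomes k + 1, odd k stays k
    }

    // 1D median filter with replicated borders; kernel is assumed to be odd.
    public static int[] Median1D(int[] src, int kernel)
    {
        int half = kernel / 2;
        var dst = new int[src.Length];
        for (int i = 0; i < src.Length; i++)
        {
            var window = new int[kernel];
            for (int j = 0; j < kernel; j++)
            {
                int idx = Math.Min(src.Length - 1, Math.Max(0, i + j - half));
                window[j] = src[idx];
            }
            Array.Sort(window);
            dst[i] = window[half];               // middle element of the sorted window
        }
        return dst;
    }
}
```

With the settings used in this article (NoiseReduction 0.7, Scale 512) the kernel is 9; with the default 0.33 it is 5. A very small NoiseReduction yields 1, which the excerpt skips through its `medianKernel > 1` check.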

- Processing function (CalculateOutput) - edge detection

- Depending on the EdgesTight setting, suitable lower and upper values for the Canny algorithm are computed, and then Canny is run.

- sigma is the EdgesTight value.

int upper, lower;
CalculateThresholdBounds(matGray, out lower, out upper, sigma);
return matGray.Canny(lower, upper, 3, true);
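
The body of CalculateThresholdBounds is not shown in this article; the text only says that the histogram median is combined with EdgesTight. The sketch below uses the commonly seen median-based heuristic, lower = (1 - sigma) * median and upper = (1 + sigma) * median, clamped to 0..255. That formula is an assumption and the asset may compute the bounds differently; CannyBoundsSketch, Median and Bounds are hypothetical names.

```csharp
using System;

public static class CannyBoundsSketch
{
    // Median gray level, read off the 256-bin histogram.
    public static int Median(byte[] pixels)
    {
        var hist = new int[256];
        foreach (byte p in pixels) hist[p]++;

        int half = (pixels.Length + 1) / 2, seen = 0;
        for (int v = 0; v < 256; v++)
        {
            seen += hist[v];
            if (seen >= half) return v;
        }
        return 255;
    }

    // ASSUMPTION: the usual "auto Canny" heuristic, not confirmed by the article.
    public static void Bounds(byte[] pixels, double sigma, out int lower, out int upper)
    {
        int median = Median(pixels);
        lower = (int)Math.Max(0, Math.Round((1.0 - sigma) * median));
        upper = (int)Math.Min(255, Math.Round((1.0 + sigma) * median));
    }
}
```

Under this assumption a larger sigma widens the band between lower and upper, which matches the description of EdgesTight given in the options section.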

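- Processing function (CalculateOutput) - finding document candidates through contour detection (Cv2.FindContours)
- See the contour detection part of https://dean83.tistory.com/67.
- Loop over the detected contours and keep only those that could be a document.
- Contours with a perimeter shorter than 25 are dropped.
- Cv2.ApproxPolyDP simplifies each contour to the few points needed to draw its outline (the tolerance is 1% of the perimeter).
See the contour detection part of https://dean83.tistory.com/67.
- The simplified points are collected in a List (keyPoints).
- If the simplified contour has between 4 and 6 points, its area is computed (Cv2.ContourArea).
- If area / image area is at least the ExpectedArea setting (0.2 here), the contour is likely to be the document,
so it is stored in a separate List (goodCandidates).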

Point[][] contours;
HierarchyIndex[] hierarchy;
Cv2.FindContours(matEdges, out contours, out hierarchy, RetrievalModes.List, ContourApproximationModes.ApproxNone, null);

// check contours and drop those we consider "noise", all others put into a single huge "key points" map
// also, detect all almost-rectangular contours with big area and try to determine whether they're exact match
List<Point> keyPoints = new List<Point>();
List<Point[]> goodCandidates = new List<Point[]>();
double referenceArea = matScaled.Width * matScaled.Height;
foreach (Point[] contour in contours)
{
    double length = Cv2.ArcLength(contour, true);

    // drop mini-contours
    if (length >= 25.0)
    {
        Point[] approx = Cv2.ApproxPolyDP(contour, length * 0.01, true);
        keyPoints.AddRange(approx);

        if (approx.Length >= 4 && approx.Length <= 6)
        {
            double area = Cv2.ContourArea(approx);
            if (area / referenceArea >= Settings.ExpectedArea)
                goodCandidates.Add(approx);
        }
    }
}
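
The candidate test in the loop above can be tried without OpenCV. In this sketch the polygon area is computed with the shoelace formula and the perimeter as the sum of edge lengths, standing in for Cv2.ContourArea and Cv2.ArcLength. CandidateSketch and its members are hypothetical names, and unlike the excerpt the perimeter check here is applied to the already simplified polygon.

```csharp
using System;

public static class CandidateSketch
{
    // Closed-polygon perimeter (sum of edge lengths).
    public static double Perimeter((double X, double Y)[] p)
    {
        double sum = 0;
        for (int i = 0; i < p.Length; i++)
        {
            var a = p[i];
            var b = p[(i + 1) % p.Length];
            sum += Math.Sqrt((a.X - b.X) * (a.X - b.X) + (a.Y - b.Y) * (a.Y - b.Y));
        }
        return sum;
    }

    // Shoelace formula: absolute area of a simple polygon.
    public static double Area((double X, double Y)[] p)
    {
        double twice = 0;
        for (int i = 0; i < p.Length; i++)
        {
            var a = p[i];
            var b = p[(i + 1) % p.Length];
            twice += a.X * b.Y - b.X * a.Y;
        }
        return Math.Abs(twice) / 2;
    }

    // Same three conditions as the loop: long enough, 4 to 6 points, large enough.
    public static bool IsGoodCandidate((double X, double Y)[] approx,
        double referenceArea, double expectedArea)
    {
        if (Perimeter(approx) < 25.0) return false;
        if (approx.Length < 4 || approx.Length > 6) return false;
        return Area(approx) / referenceArea >= expectedArea;
    }
}
```

In a 512 x 512 working image with ExpectedArea = 0.2, a 300 x 200 rectangle (ratio about 0.23) passes, while a 100 x 100 rectangle (ratio about 0.04) does not.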

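- Processing function (CalculateOutput) - Convex hull
- A convex hull wraps the key points collected above into a single outline.
- The minimum set of points needed to draw that outline is extracted (Cv2.ApproxPolyDP).
- The document outline is then chosen from the sufficiently large candidates found above (GetBestMatchingContour).

* Convex hull
- An algorithm that builds a polygon enclosing every point. Even if the points are scattered all over, only the outermost points are connected to form one large polygon, so that the remaining points lie inside it.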

Point[] hull = Cv2.ConvexHull(keyPoints);
Point[] hullContour = Cv2.ApproxPolyDP(hull, Cv2.ArcLength(hull, true) * 0.01, true);

// find best guess for our contour
Point[] paperContour = GetBestMatchingContour(matScaled.Width * matScaled.Height, goodCandidates, hullContour);
if (null == paperContour)
{
    shape_ = null;
    dirty_ = false;
    matOutput_ = matInput_;
    return;
}
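
For readers who want to see what a convex hull computation does, here is a minimal sketch using Andrew's monotone chain algorithm. This is a swapped-in textbook method for illustration; the article does not say which algorithm Cv2.ConvexHull uses internally. HullSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class HullSketch
{
    // Cross product of (a - o) and (b - o); its sign gives the turn direction.
    static long Cross((int X, int Y) o, (int X, int Y) a, (int X, int Y) b)
        => (long)(a.X - o.X) * (b.Y - o.Y) - (long)(a.Y - o.Y) * (b.X - o.X);

    // Returns the hull vertices in order; interior and collinear points are dropped.
    public static List<(int X, int Y)> ConvexHull(List<(int X, int Y)> points)
    {
        var pts = new List<(int X, int Y)>(points);
        pts.Sort();                                   // tuples sort by X, then Y
        if (pts.Count < 3) return pts;

        var hull = new List<(int X, int Y)>();
        foreach (var p in pts)                        // first chain, left to right
        {
            while (hull.Count >= 2 && Cross(hull[hull.Count - 2], hull[hull.Count - 1], p) <= 0)
                hull.RemoveAt(hull.Count - 1);
            hull.Add(p);
        }

        int firstChainSize = hull.Count + 1;
        for (int i = pts.Count - 2; i >= 0; i--)      // second chain, right to left
        {
            var p = pts[i];
            while (hull.Count >= firstChainSize && Cross(hull[hull.Count - 2], hull[hull.Count - 1], p) <= 0)
                hull.RemoveAt(hull.Count - 1);
            hull.Add(p);
        }

        hull.RemoveAt(hull.Count - 1);                // the last point repeats the first
        return hull;
    }
}
```

Given the four corners of a square plus points inside it, only the four corners remain, which is the "outermost points only" behavior described above.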

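- Processing function (CalculateOutput) - GetBestMatchingContour
- The candidates kept after Cv2.FindContours (perimeter of at least 25, 4 to 6 points, area ratio of at least ExpectedArea) are handled as follows: a single candidate is used as is; several candidates are merged and the polygon enclosing all of them is built (convex hull); with no candidate at all, the hull of all key points is used.
- The candidates found above may not be the document itself. They may be images inside the document, which is why all their points are gathered and the enclosing polygon is computed.
- Once the presumed document polygon is found and DropBadGuess is enabled, it is discarded if its area ratio is below 75% of ExpectedArea.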

Point[] result = hull;
if (candidates.Count == 1)
    result = candidates[0];
else if (candidates.Count > 1)
{
    List<Point> keys = new List<Point>();
    foreach (var c in candidates)
        keys.AddRange(c);

    Point[] joinedCandidates = Cv2.ConvexHull(keys);
    Point[] joinedHull = Cv2.ApproxPolyDP(joinedCandidates, Cv2.ArcLength(joinedCandidates, true) * 0.01, true);
    result = joinedHull;
}

// check further
if (Settings.DropBadGuess)
{
    double area = Cv2.ContourArea(result);
    if (area / areaSize < Settings.ExpectedArea * 0.75)
        result = null;
}

return result;

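- Processing function (CalculateOutput) - Convex hull, continued
- Once the presumed document polygon has been obtained, processing continues based on its number of points.

- With exactly 4 points it is a quadrilateral, so it is treated as the document and its corners are sorted (top-left, top-right, bottom-right, bottom-left).
- With more than 2 points (but not 4), the paper may be folded or a corner may be missing, so the point count differs from 4.
- In that case the minimum-area rectangle enclosing the points is computed (Cv2.MinAreaRect).
- The corners of that rectangle are sorted (top-left, top-right, bottom-right, bottom-left).
- For each of the 4 corners, the nearest actually measured point is looked up (among the points returned by GetBestMatchingContour).
- The result may no longer be an exact rectangle, but it consists of the 4 measured points closest to the 4 corners.
- If the image was rescaled, the points are scaled back to input image coordinates.

* Cv2.MinAreaRect returns the smallest rectangle enclosing the given points. The rectangle may be rotated rather than axis-aligned.

The result is a RotatedRect holding the center, the width and height, and the angle. Its Points() method returns the 4 corner coordinates, which is what the asset code uses.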

// exact hit - we have 4 corners
if (paperContour.Length == 4)
{
    paperContour = SortCorners(paperContour);
}
// some hit: we either have 3 points or > 4 which we can try to make a 4-corner shape
else if (paperContour.Length > 2)
{
    // yet contour might contain too much points: along with calculation inaccuracies we might face a
    // bended piece of paper, missing corner etc.
    // the solution is to use bounding box
    RotatedRect bounds = Cv2.MinAreaRect(paperContour);
    Point2f[] points = bounds.Points();
    Point[] intPoints = Array.ConvertAll(points, p => new Point(Math.Round(p.X), Math.Round(p.Y)));
    Point[] fourCorners = SortCorners(intPoints);

    // array.ClosestElement is not efficient but we can live with it since it's quite few
    // elements to search for
    System.Func<Point, Point, double> distance = (Point x, Point y) => Point.Distance(x, y);
    Point[] closest = new Point[4];
    for (int i = 0; i < fourCorners.Length; ++i)
        closest[i] = paperContour.ClosestElement(fourCorners[i], distance);

    paperContour = closest;
}

// scale contour back to input image coordinate space - if necessary
if (sx != 1 || sy != 1)
{
    for (int i = 0; i < paperContour.Length; ++i)
    {
        Point2f pt = paperContour[i];
        paperContour[i] = new Point2f(pt.X / sx, pt.Y / sy);
    }
}
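
The corner handling above can be sketched without OpenCV. SortCorners is not shown in this article, so the ordering rule below (top-left has the smallest x + y, bottom-right the largest, top-right the largest x - y, bottom-left the smallest) is an assumption; it is a common heuristic that breaks down for quadrilaterals rotated by about 45 degrees, and the asset may sort differently. CornerSketch, Closest and ScaleBack are hypothetical names.

```csharp
using System;
using System.Linq;

public static class CornerSketch
{
    // ASSUMED ordering rule: returns top-left, top-right, bottom-right, bottom-left
    // in image coordinates (y grows downwards).
    public static (double X, double Y)[] SortCorners((double X, double Y)[] c)
    {
        var tl = c.OrderBy(p => p.X + p.Y).First();
        var br = c.OrderByDescending(p => p.X + p.Y).First();
        var tr = c.OrderByDescending(p => p.X - p.Y).First();
        var bl = c.OrderBy(p => p.X - p.Y).First();
        return new[] { tl, tr, br, bl };
    }

    // Nearest measured contour point to a bounding-box corner.
    public static (double X, double Y) Closest((double X, double Y)[] contour, (double X, double Y) target)
        => contour.OrderBy(p => (p.X - target.X) * (p.X - target.X)
                              + (p.Y - target.Y) * (p.Y - target.Y)).First();

    // Maps points from the scaled working image back to input image coordinates.
    public static (double X, double Y)[] ScaleBack((double X, double Y)[] pts, double sx, double sy)
        => pts.Select(p => (X: p.X / sx, Y: p.Y / sy)).ToArray();
}
```

Dividing by sx and sy undoes the earlier downscaling, so a point at (10, 12) in a working image scaled by (0.5, 0.25) maps back to (20, 48) in the input image.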

- Processing function (CalculateOutput) - drawing the result
- After the presumed document points are found, UnwrapShape is called if there are exactly 4 of them (a quadrilateral).

var matUnwrapped = matInput_;
bool needConvertionToBGR = true;
if (paperContour.Length == 4)
{
    matUnwrapped = matInput_.UnwrapShape(Array.ConvertAll(paperContour, p => new Point2f(p.X, p.Y)));

- Processing function (CalculateOutput) - drawing the result - UnwrapShape
- The width and height are computed from the presumed document corners found above.

float width = (float)Point2f.Distance(corners[0], corners[1]);  // lt -> rt
float height = (float)Point2f.Distance(corners[0], corners[3]); // lt -> lb

- If the width or height exceeds maxSize, the size is scaled down while keeping the aspect ratio.

// downscaling
if (maxSize > 0 && (width > maxSize || height > maxSize))
{
    if (width > height)
    {
        var s = maxSize / width;
        width = maxSize;
        height = height * s;
    }
    else
    {
        var s = maxSize / height;
        height = maxSize;
        width = width * s;
    }
}

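- From width and height, the target rectangle that each original corner maps to is built and the transform is computed (GetPerspectiveTransform);
the tilted document is then turned to face the viewer (perspective transform, WarpPerspective).
For example, if the document appears as a trapezoid, the image along the shorter side is stretched so that the result is a square or a rectangle.
An interpolation option is applied because the image is upsampled or downsampled in the process.
* InterpolationFlags values accepted by WarpPerspective (I could not tell the difference in the results)
- Nearest
- Linear
- Cubic
- Area
- Lanczos4
- Max
- WarpFillOutliers
- WarpInverseMap
Of these, Nearest through Lanczos4 are interpolation methods; Max, WarpFillOutliers and WarpInverseMap are a mask and flags in the same enum rather than interpolation methods.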

// compute transform
// Target coordinates: an upright rectangle rather than a skewed quadrilateral.
Point2f[] destination = new Point2f[]
{
    new Point2f(0,     0),
    new Point2f(width, 0),
    new Point2f(width, height),
    new Point2f(0,     height)
};

// Computes the transform that maps the corners of the skewed or tilted document
// onto the corners of the target rectangle.
var transform = Cv2.GetPerspectiveTransform(corners, destination);

// un-warp
// Warp the original image with the transform computed above.
return img.WarpPerspective(transform, new Size(width, height), InterpolationFlags.Cubic);

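- Processing function (CalculateOutput) - color conversion and displaying the result
- Depending on the settings, the image is converted to black and white when needed.
- The conversion uses either Otsu or the adaptive MeanC threshold.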

// automatic color converter
bool convertColor = (ScannerSettings.DecolorizationMode.Always == Settings.Decolorization);
if (ScannerSettings.DecolorizationMode.Automatic == Settings.Decolorization)
    convertColor = !IsColored(matUnwrapped);

// perform color conversion to b&w
if (convertColor)
{
    matUnwrapped = matUnwrapped.CvtColor(ColorConversionCodes.BGR2GRAY);

    // we have some constants for Adaptive, but this can be improved with some 'educated guess' for the constants depending on input image
    if (ScannerSettings.ScanType.Adaptive == Settings.ColorThreshold)
        matUnwrapped = matUnwrapped.AdaptiveThreshold(255, AdaptiveThresholdTypes.MeanC, ThresholdTypes.Binary, 47, 25);
    // Otsu doesn't need our help, decent on it's own
    else
        matUnwrapped = matUnwrapped.Threshold(0, 255, ThresholdTypes.Binary | ThresholdTypes.Otsu);
}
else
{
    needConvertionToBGR = false;
}


// assign result
shape_ = paperContour;

matOutput_ = matUnwrapped;
if (needConvertionToBGR)
    matOutput_ = matOutput_.CvtColor(ColorConversionCodes.GRAY2BGR);    // to make it compatible with input texture

// mark we're good
dirty_ = false;
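
To show what the adaptive MeanC threshold in the excerpt does, here is a minimal sketch on a 2D byte array: each pixel is compared with the mean of its blockSize x blockSize neighborhood minus a constant c (47 and 25 in the excerpt). AdaptiveMeanSketch is a hypothetical name, the brute-force loops are written for readability rather than speed, and blockSize is assumed to be odd.

```csharp
using System;

public static class AdaptiveMeanSketch
{
    // Binary threshold: 255 where the pixel is brighter than (local mean - c), else 0.
    public static byte[,] Threshold(byte[,] src, int blockSize, double c)
    {
        int h = src.GetLength(0), w = src.GetLength(1), half = blockSize / 2;
        var dst = new byte[h, w];
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                double sum = 0;
                for (int dy = -half; dy <= half; dy++)
                {
                    for (int dx = -half; dx <= half; dx++)
                    {
                        // replicate border pixels outside the image
                        int yy = Math.Min(h - 1, Math.Max(0, y + dy));
                        int xx = Math.Min(w - 1, Math.Max(0, x + dx));
                        sum += src[yy, xx];
                    }
                }
                double mean = sum / (blockSize * blockSize);
                dst[y, x] = src[y, x] > mean - c ? (byte)255 : (byte)0;
            }
        }
        return dst;
    }
}
```

Because the threshold follows the local mean, dark text stays black even where the page is unevenly lit, which is presumably why Adaptive is the default for documents.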