How to force Mobile Vision for Android to read a full line of text

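I implemented Google's Mobile Vision for Android by following the tutorial. I am trying to build an app that scans a receipt and finds the numeric total. However, when I scan receipts printed in different formats, the API detects TextBlocks in a seemingly arbitrary way. For example, on one receipt, several words separated by single spaces are grouped into a single TextBlock. But when two words are separated by a lot of whitespace, they are split into independent TextBlocks, even though they appear on the same "line". What I am trying to do is force the API to recognize each entire row of the receipt as a single entity. Is this possible?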


2017-02-21 gig6

Have you found a solution yet? If so, were you able to run detection on an existing image, as opposed to using the camera app in real time? – DaveNOTDavid

public ArrayList<T> getAllGraphicsInRow(float rawY) {
    synchronized (mLock) {
        ArrayList<T> row = new ArrayList<>();
        // Get the position of this View so the raw location can be offset relative to the view.
        int[] location = new int[2];
        this.getLocationOnScreen(location);
        for (T graphic : mGraphics) {
            float rawX = this.getWidth();
            // Probe the row every 10 px across the view's width and keep each graphic that is hit.
            for (int i = 0; i < rawX; i += 10) {
                if (graphic.contains(i - location[0], rawY - location[1])) {
                    if (!row.contains(graphic)) {
                        row.add(graphic);
                    }
                }
            }
        }
        return row;
    }
}

This should go in the GraphicOverlay.java file. It essentially returns all the graphics in the row at the given y-coordinate.

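// True when a and b differ by less than eps.
public static boolean almostEqual(double a, double b, double eps) {
    return Math.abs(a - b) < eps;
}

// Two points count as being on the same row when their y values differ by less than 10.
public static boolean pointAlmostEqual(Point a, Point b) {
    return almostEqual(a.y, b.y, 10);
}

// Two boxes count as being on the same row when every pair of corresponding corners does.
public static boolean cornerPointAlmostEqual(Point[] rect1, Point[] rect2) {
    boolean almostEqual = true;
    for (int i = 0; i < rect1.length; i++) {
        if (!pointAlmostEqual(rect1[i], rect2[i])) {
            almostEqual = false;
        }
    }
    return almostEqual;
}
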
private boolean onTap(float rawX, float rawY) {
    String priceRegex = "(\\d+[,.]\\d\\d)";
    ArrayList<OcrGraphic> graphics = mGraphicOverlay.getAllGraphicsInRow(rawY);
    OcrGraphic currentGraphics = mGraphicOverlay.getGraphicAtLocation(rawX, rawY);
    if (graphics != null && currentGraphics != null) {
        List<? extends Text> currentComponents = currentGraphics.getTextBlock().getComponents();
        final Pattern pattern = Pattern.compile(priceRegex);
        final Pattern pattern1 = Pattern.compile(priceRegex);

        TextBlock text = null;
        Log.i("text results", "This many in the row: " + Integer.toString(graphics.size()));

        // Collect the lines of every other block that sits on the tapped row.
        ArrayList<Text> combinedComponents = new ArrayList<>();
        for (OcrGraphic graphic : graphics) {
            if (!graphic.equals(currentGraphics)) {
                text = graphic.getTextBlock();
                Log.i("text results", text.getValue());
                combinedComponents.addAll(text.getComponents());
            }
        }

        for (Text currentText : currentComponents) { // goes through the lines of the tapped block
            final Matcher matcher = pattern.matcher(currentText.getValue()); // looks for a price in this line
            Point[] currentPoint = currentText.getCornerPoints();

            for (Text otherCurrentText : combinedComponents) { // looks for other components that are in the same row
                final Matcher otherMatcher = pattern1.matcher(otherCurrentText.getValue()); // looks for a price in the other component
                Point[] innerCurrentPoint = otherCurrentText.getCornerPoints();

                if (cornerPointAlmostEqual(currentPoint, innerCurrentPoint)) {
                    if (matcher.find()) { // if you click on the price
                        Log.i("oh yes", "Item: " + otherCurrentText.getValue());
                        Log.i("oh yes", "Value: " + matcher.group(1));
                        itemList.add(otherCurrentText.getValue());
                        // Float.valueOf rejects a comma separator, so normalize it to a period first.
                        priceList.add(Float.valueOf(matcher.group(1).replace(',', '.')));
                    }
                    if (otherMatcher.find()) { // if you click on the item
                        Log.i("oh yes", "Item: " + currentText.getValue());
                        Log.i("oh yes", "Value: " + otherMatcher.group(1));
                        itemList.add(currentText.getValue());
                        priceList.add(Float.valueOf(otherMatcher.group(1).replace(',', '.')));
                    }
                    Toast toast = Toast.makeText(this, " Text Captured!", Toast.LENGTH_SHORT);
                    toast.show();
                }
            }

        }
        return true;
    }
    return false;
}

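This should go in OcrCaptureActivity.java. It breaks the tapped TextBlock into lines, finds the blocks in the same row as each line, checks whether either component contains a price, and logs and stores the item and value accordingly.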
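
The price pattern accepts either a comma or a period as the decimal separator, while Float.valueOf only parses a period, so a price matched with a comma has to be normalized before it is converted. Below is a minimal standalone sketch of that extraction step, which can be tried against sample receipt text outside the app. The PriceExtractor class and its firstPrice method are hypothetical names used for illustration; they are not part of the answer or of Mobile Vision.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Mobile Vision.
final class PriceExtractor {
    // Same pattern as priceRegex in onTap: digits, a comma or period, then two digits.
    private static final Pattern PRICE = Pattern.compile("(\\d+[,.]\\d\\d)");

    /** Returns the first price found in the text, or empty if there is none. */
    static Optional<Float> firstPrice(String text) {
        Matcher m = PRICE.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        // Float.valueOf throws NumberFormatException on "12,50", so swap the comma first.
        return Optional.of(Float.valueOf(m.group(1).replace(',', '.')));
    }

    public static void main(String[] args) {
        System.out.println(firstPrice("TOTAL 12,50")); // a line with a comma-separated price
        System.out.println(firstPrice("Milk"));        // a line without any price
    }
}
```

Returning an Optional keeps the "no price on this line" case explicit, which matches how onTap treats a line as either an item or a price.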

The eps value in almostEqual is the tolerance on vertical position used when checking whether graphics are in the same row.
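
To see how the tolerance affects the result, here is a small standalone sketch that merges recognized fragments into full rows using the same almostEqual test. It is not the answer's code: Fragment is a hypothetical stand-in for a recognized line, holding only its text and the left and top coordinates of its bounding box, and groupIntoRows is an illustrative name. Comparing only the top coordinate is a simplifying assumption; the answer compares all four corner points.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; Fragment stands in for a recognized line of text.
final class RowGrouper {
    static final class Fragment {
        final String text;
        final int left; // x of the bounding box's left edge
        final int top;  // y of the bounding box's top edge

        Fragment(String text, int left, int top) {
            this.text = text;
            this.left = left;
            this.top = top;
        }
    }

    // Same tolerance test as almostEqual in the answer.
    static boolean almostEqual(double a, double b, double eps) {
        return Math.abs(a - b) < eps;
    }

    /** Groups fragments whose top edges lie within eps of each other and joins each row left to right. */
    static List<String> groupIntoRows(List<Fragment> fragments, double eps) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingInt((Fragment f) -> f.top));

        List<List<Fragment>> rows = new ArrayList<>();
        for (Fragment f : sorted) {
            List<Fragment> last = rows.isEmpty() ? null : rows.get(rows.size() - 1);
            // Compare against the first fragment of the current row so the row cannot drift downward.
            if (last != null && almostEqual(last.get(0).top, f.top, eps)) {
                last.add(f);
            } else {
                List<Fragment> row = new ArrayList<>();
                row.add(f);
                rows.add(row);
            }
        }

        List<String> lines = new ArrayList<>();
        for (List<Fragment> row : rows) {
            row.sort(Comparator.comparingInt((Fragment f) -> f.left));
            lines.add(row.stream().map(f -> f.text).collect(Collectors.joining(" ")));
        }
        return lines;
    }
}
```

With a tolerance of 10, an item at top 100 and its price at top 104 end up on one line; with a tolerance of 2 they are treated as separate rows. A value that is too large would start merging neighbouring receipt lines, so it likely needs tuning to the text height in the image.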


2017-12-08 20:02:53 bhuang

This approach has to use CameraSourcePreview and GraphicOverlay, classes from the Text Recognition API, so it only works when using the camera app in real time, not with an existing image. – DaveNOTDavid
