JPedal - Highlighting a word at a point in a PDF

I want to implement a feature that lets the user double-click a word in a PDF document to highlight it, using the JPedal library. This would be trivial if I could get each word's bounding rectangle and check whether the MouseEvent location falls inside it.

private void highlightText() { 
    Rectangle highlightRectangle = new Rectangle(firstPoint.x, firstPoint.y, 
      secondPoint.x - firstPoint.x, secondPoint.y - firstPoint.y); 
    pdfDecoder.getTextLines().addHighlights(new Rectangle[]{highlightRectangle}, false, currentPage); 
    pdfDecoder.repaint(); 
}

The code above shows how to highlight a region. However, the only examples I can find in the documentation cover plain-text extraction.

2012-09-26 lotophage

After looking at Mark's examples I managed to get this working. There are a few quirks, so I'll explain how it works in case it helps someone else. The key method is extractTextAsWordlist which, given a region to extract from, returns a List<String> of the form {word1, w1_x1, w1_y1, w1_x2, w1_y2, word2, w2_x1, ...}. Step-by-step instructions follow.

First, you need to convert the MouseEvent's component/screen coordinates to PDF page coordinates and correct for scaling:

/** 
* Transforms Component coordinates to page coordinates, correcting for 
* scaling and panning. 
* 
* @param x Component x-coordinate 
* @param y Component y-coordinate 
* @return Point on the PDF page 
*/ 
private Point getPageCoordinates(int x, int y) { 
    float scaling = pdfDecoder.getScaling(); 
    int x_offset = ((pdfDecoder.getWidth() - pdfDecoder.getPDFWidth())/2); 
    int y_offset = pdfDecoder.getPDFHeight(); 
    int correctedX = (int)((x - x_offset + viewportOffset.x)/scaling); 
    int correctedY = (int)((y_offset - (y + viewportOffset.y))/scaling); 
    return new Point(correctedX, correctedY); 
}
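
To make the arithmetic concrete, here is a minimal standalone sketch of the same transform with every decoder value passed in as a plain parameter. The class name, parameter names, and sample numbers are illustrative assumptions, not part of JPedal. It also assumes, as the method above implies, that the PDF width and height are the scaled on-screen dimensions.

```java
import java.awt.Point;

public class PageCoordinateSketch {

    /**
     * Same arithmetic as getPageCoordinates, but with all inputs explicit so
     * it runs without a PdfDecoder. Parameter names are hypothetical
     * stand-ins for the values the original reads from the decoder.
     */
    public static Point toPageCoordinates(int x, int y, float scaling,
                                          int componentWidth, int pdfWidth,
                                          int pdfHeight, Point viewportOffset) {
        // Margin added when the page is centred horizontally in the component
        int xOffset = (componentWidth - pdfWidth) / 2;
        int correctedX = (int) ((x - xOffset + viewportOffset.x) / scaling);
        // Screen y grows downwards, page y grows upwards, so flip the axis
        int correctedY = (int) ((pdfHeight - (y + viewportOffset.y)) / scaling);
        return new Point(correctedX, correctedY);
    }

    public static void main(String[] args) {
        // Sample values (assumed): 2x zoom, 1400 px wide component,
        // 1200 x 1600 px scaled page, no panning
        Point p = toPageCoordinates(300, 200, 2.0f, 1400, 1200, 1600, new Point(0, 0));
        System.out.println(p.x + ", " + p.y);
    }
}
```

With these sample values the 100-pixel centring margin is removed, the result is divided by the scaling factor, and the y-axis is flipped so that it is measured from the bottom of the page, giving (100, 700).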

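Next, create a box to scan for text. I chose to make it the full width of the page and +/- 20 page units vertically (a fairly arbitrary number), centred on the MouseEvent: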

/** 
* Scans for all the words located within a box the width of the page and 
* 40 points high, centered at the supplied point. 
* 
* @param p Point to centre the scan box around 
* @return A List of words within the scan box 
* @throws PdfException 
*/ 
private List<String> scanForWords(Point p) throws PdfException {
    List<String> result = Collections.emptyList();
    if (pdfDecoder.getlastPageDecoded() > 0) {
        PdfGroupingAlgorithms currentGrouping = pdfDecoder.getGroupingObject();
        PdfPageData currentPageData = pdfDecoder.getPdfPageData();
        // the scan box spans the full width of the page...
        int x1 = currentPageData.getMediaBoxX(currentPage);
        int x2 = currentPageData.getMediaBoxWidth(currentPage) + x1;
        // ...and 20 page units above and below the point
        int y1 = p.y + 20;
        int y2 = p.y - 20;
        result = currentGrouping.extractTextAsWordlist(x1, y1, x2, y2, currentPage, true, "");
    }
    return result;
}

I then parse this list into a sequence of Rectangles:

/** 
* Parse a String sequence of: 
* {word1, w1_x1, w1_y1, w1_x2, w1_y2, word2, w2_x1, ...} 
* 
* Into a sequence of Rectangles. 
* 
* @param wordList Word list sequence to parse 
* @return A List of Rectangles 
*/ 
private List<Rectangle> parseWordBounds(List<String> wordList) {
    List<Rectangle> wordBounds = new LinkedList<Rectangle>();
    Iterator<String> wordListIterator = wordList.iterator();
    while (wordListIterator.hasNext()) {
        // sequences are: {word, x1, y1, x2, y2}
        wordListIterator.next(); // skip the word
        int x1 = (int) Float.parseFloat(wordListIterator.next());
        int y1 = (int) Float.parseFloat(wordListIterator.next());
        int x2 = (int) Float.parseFloat(wordListIterator.next());
        int y2 = (int) Float.parseFloat(wordListIterator.next());
        wordBounds.add(new Rectangle(x1, y2, x2 - x1, y1 - y2)); // in page, not screen coordinates
    }
    return wordBounds;
}
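
The flat {word, x1, y1, x2, y2} layout is easier to follow with sample data. The sketch below parses a hand-written list in that shape; the sample words and coordinates are made up, and the class name is hypothetical. Unlike the method above, it steps through the list by index and ignores an incomplete trailing record instead of throwing, which is a design choice of this sketch and not something the original does.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordBoundsSketch {

    /** Parses {word, x1, y1, x2, y2, ...} records into page-space Rectangles. */
    public static List<Rectangle> parse(List<String> wordList) {
        List<Rectangle> bounds = new ArrayList<>();
        // Walk the list five entries at a time; a short trailing record is skipped
        for (int i = 0; i + 4 < wordList.size(); i += 5) {
            int x1 = (int) Float.parseFloat(wordList.get(i + 1));
            int y1 = (int) Float.parseFloat(wordList.get(i + 2));
            int x2 = (int) Float.parseFloat(wordList.get(i + 3));
            int y2 = (int) Float.parseFloat(wordList.get(i + 4));
            // As in the original, y1 is treated as the top edge and y2 as the bottom
            bounds.add(new Rectangle(x1, y2, x2 - x1, y1 - y2));
        }
        return bounds;
    }

    public static void main(String[] args) {
        // Made-up sample in the shape described for extractTextAsWordlist
        List<String> sample = Arrays.asList(
                "Hello", "72.0", "700.5", "110.2", "688.0",
                "world", "115.0", "700.5", "150.9", "688.0");
        for (Rectangle r : parse(sample)) {
            System.out.println(r);
        }
    }
}
```

Note that the float coordinates are truncated to ints, so each Rectangle can be up to one unit smaller than the reported word box.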

Then work out which Rectangle the MouseEvent fell within:

/** 
* Finds the bounding Rectangle of a word located at a Point. 
* 
* @param p Point to find word bounds 
* @param wordBounds List of word boundaries to search 
* @return A Rectangle that bounds a word and contains a point, or null if 
*   there is no word located at the point 
*/ 
private Rectangle findWordBoundsAtPoint(Point p, List<Rectangle> wordBounds) {
    Rectangle result = null;
    for (Rectangle wordBound : wordBounds) {
        if (wordBound.contains(p)) {
            result = wordBound;
            break;
        }
    }
    return result;
}
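
As a possible variation, the same lookup can be written with a stream and return an Optional, so a click that misses every word is an empty result instead of null. This is only a sketch: the class name and the sample rectangles are assumptions made for illustration.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class HitTestSketch {

    /** Returns the first Rectangle containing p, or an empty Optional. */
    public static Optional<Rectangle> findAt(Point p, List<Rectangle> bounds) {
        return bounds.stream().filter(r -> r.contains(p)).findFirst();
    }

    public static void main(String[] args) {
        // Two made-up word boxes with a gap between them (x = 110 to 115)
        List<Rectangle> bounds = Arrays.asList(
                new Rectangle(72, 688, 38, 12),
                new Rectangle(115, 688, 35, 12));
        System.out.println(findAt(new Point(80, 690), bounds).isPresent());  // inside the first box
        System.out.println(findAt(new Point(112, 690), bounds).isPresent()); // in the gap
    }
}
```

Rectangle.contains treats the right and bottom edges as outside the rectangle, so a point exactly on those edges does not count as a hit.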

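For some reason, simply passing this Rectangle to the highlighting method does not work. After some tinkering, I found that shrinking the Rectangle by one point on each side fixes the problem: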

/** 
* Contracts a Rectangle to enable it to be highlighted. 
* 
* @return A contracted Highlight Rectangle 
*/ 
private Rectangle contractHighlight(Rectangle highlight){ 
    int x = highlight.x + 1; 
    int y = highlight.y + 1; 
    int width = highlight.width -2; 
    int height = highlight.height - 2; 
    return new Rectangle(x, y, width, height); 
}
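
For reference, the standard library can do the same contraction: java.awt.Rectangle.grow accepts negative amounts and shrinks the rectangle around its centre. The sketch below (class and method names are hypothetical) copies the input first, because grow modifies the Rectangle in place.

```java
import java.awt.Rectangle;

public class ContractSketch {

    /** Returns a copy of the rectangle shrunk by one unit on every side. */
    public static Rectangle contract(Rectangle highlight) {
        Rectangle copy = new Rectangle(highlight); // grow() mutates, so work on a copy
        copy.grow(-1, -1);                         // negative values shrink the rectangle
        return copy;
    }

    public static void main(String[] args) {
        Rectangle word = new Rectangle(72, 688, 38, 12); // made-up word box
        System.out.println(contract(word));
        System.out.println(word); // the original is left untouched
    }
}
```

For the sample box this yields x=73, y=689, width=36, height=10, the same result as adding 1 to the origin and subtracting 2 from each dimension by hand.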

Then I just pass it to this method to add the highlight:

/** 
* Highlights text on the document 
*/ 
private void highlightText(Rectangle highlightRectangle) { 
    pdfDecoder.getTextLines().addHighlights(new Rectangle[]{highlightRectangle}, false, currentPage); 
    pdfDecoder.repaint(); 
}

Finally, all of the above calls are wrapped in the following convenience method:

/** 
* Highlights the word at the given point. 
* 
* @param p Point where word is located 
*/ 
private void highlightWordAtPoint(Point p) {
    try {
        Rectangle wordBounds = findWordBoundsAtPoint(p, parseWordBounds(scanForWords(p)));
        if (wordBounds != null) {
            highlightText(contractHighlight(wordBounds));
        }
    } catch (PdfException e) {
        // text extraction failed; report it and leave the page unhighlighted
        e.printStackTrace();
    }
}
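
The question asked for highlighting on double-click, which the steps above do not wire up. One way to do it, sketched here with standard AWT classes only, is a MouseAdapter that forwards left-button double-clicks to a callback. The class name and the callback parameter are hypothetical.

```java
import java.awt.Point;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import java.util.function.Consumer;

public class DoubleClickSketch extends MouseAdapter {

    private final Consumer<Point> onDoubleClick;

    public DoubleClickSketch(Consumer<Point> onDoubleClick) {
        this.onDoubleClick = onDoubleClick;
    }

    /** True for a double-click with the left (primary) mouse button. */
    public static boolean isDoubleLeftClick(int button, int clickCount) {
        return button == MouseEvent.BUTTON1 && clickCount == 2;
    }

    @Override
    public void mouseClicked(MouseEvent e) {
        if (isDoubleLeftClick(e.getButton(), e.getClickCount())) {
            onDoubleClick.accept(e.getPoint()); // component coordinates
        }
    }

    public static void main(String[] args) {
        System.out.println(isDoubleLeftClick(MouseEvent.BUTTON1, 2));
        System.out.println(isDoubleLeftClick(MouseEvent.BUTTON3, 2));
    }
}
```

Assuming pdfDecoder is an AWT Component (the code above already calls getWidth() and repaint() on it), the listener could be registered along the lines of pdfDecoder.addMouseListener(new DoubleClickSketch(p -> highlightWordAtPoint(getPageCoordinates(p.x, p.y)))). The point must be converted to page coordinates first, since MouseEvent.getPoint() is in component coordinates.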

2012-09-27 08:06:33 lotophage
