You can use the "translation:" text node to tell authors from translators: authors are the preceding siblings of that text node, and translators are its following siblings.
Authors:
//text()[contains(., 'translation:')]/preceding-sibling::a[@class='booklink' and contains(@href, '/author/')]/text()
Translators:
//text()[contains(., 'translation:')]/following-sibling::a[@class='booklink' and contains(@href, '/author/')]/text()
Working sample code:
from lxml.html import fromstring
data = """
<td>
<a class="booklink" href="/author/43710/Author 1">Author 1</a>
,
<a class="booklink" href="/author/46907/Author 2">Author 2</a>
<br>
translation:
<a class="booklink" href="/author/47669/translator 1">Translator 1</a>
,
<a class="booklink" href="/author/9382/translator 2">Translator 2</a>
</td>"""
root = fromstring(data)
# Author links come before the "translation:" text node
authors = root.xpath("//text()[contains(., 'translation:')]/preceding-sibling::a[@class='booklink' and contains(@href, '/author/')]/text()")
# Translator links come after the "translation:" text node
translators = root.xpath("//text()[contains(., 'translation:')]/following-sibling::a[@class='booklink' and contains(@href, '/author/')]/text()")
print(authors)
print(translators)
Prints:
['Author 1', 'Author 2']
['Translator 1', 'Translator 2']
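
If the page holds several such cells, the `//text()` form above would mix links from all of them together. A minimal sketch of one way around that, assuming each credit list sits in its own `<td>`: evaluate the same sibling-axis expressions relative to each cell. The helper name `split_credits`, the sample markup, and the fallback for cells without a "translation:" marker are illustrative assumptions, not part of the original page.

```python
from lxml.html import fromstring

# Same predicates as in the answer above, factored out for reuse.
LINK = "a[@class='booklink' and contains(@href, '/author/')]"
MARKER = "text()[contains(., 'translation:')]"

def split_credits(cell):
    """Return (authors, translators) for a single <td> element (hypothetical helper)."""
    if not cell.xpath(f"boolean({MARKER})"):
        # Assumption: a cell without the marker lists authors only.
        return cell.xpath(f"{LINK}/text()"), []
    authors = cell.xpath(f"{MARKER}/preceding-sibling::{LINK}/text()")
    translators = cell.xpath(f"{MARKER}/following-sibling::{LINK}/text()")
    return authors, translators

# Hypothetical markup with two cells; the second has no translators.
html = """
<table><tr>
<td><a class="booklink" href="/author/1/A">A</a><br>translation:
<a class="booklink" href="/author/2/B">B</a></td>
<td><a class="booklink" href="/author/3/C">C</a></td>
</tr></table>"""

root = fromstring(html)
results = [split_credits(td) for td in root.xpath("//td")]
print(results)
```

With the two sample cells, the first should give A as the author and B as the translator, and the second should give only C as an author. Whether a cell without the marker really means "authors only" depends on the actual page, so check that fallback against real data.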