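The b element is a descendant node, but it is the a element that holds the link you want. You could probably search around some sub-pattern (I only know the xpath version well, and you seem to prefer CSS), but this alternative also gets the links you want without that: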
library(rvest)  # also provides the %>% pipe

# using a stub to facilitate accessing the URLs later with
# an absolute address
stub <- 'https://www.beeradvocate.com'
beer <- read_html(paste0(stub, '/lists/top/'))
lnx <- beer %>% html_nodes('a') %>% html_attr('href') %>%
  # this pattern matches beer profile links --
  #   the first .* is a brewery ID, the second .*
  #   is a beer ID within that brewery
  grep('profile/.*/.*/', ., value = TRUE) %>%
  paste0(stub, .)
head(lnx)
# [1] "https://www.beeradvocate.com/beer/profile/23222/78820/"
# [2] "https://www.beeradvocate.com/beer/profile/28743/136936/"
# [3] "https://www.beeradvocate.com/beer/profile/28743/146770/"
# [4] "https://www.beeradvocate.com/beer/profile/28743/87846/"
# [5] "https://www.beeradvocate.com/beer/profile/863/21690/"
# [6] "https://www.beeradvocate.com/beer/profile/17981/110635/"
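If you want to check the filtering step without hitting the site, here is a minimal sketch in base R. The hrefs vector is made up for illustration (it only stands in for what html_attr('href') might return), and the stricter pattern in beer_pat is my own assumption about the URL layout, based on the numeric IDs in the output above; it is not something the site documents.

```r
# Hypothetical hrefs standing in for the result of html_attr('href');
# NA mimics an <a> tag that has no href attribute
hrefs <- c(
  '/beer/profile/23222/78820/',
  '/lists/top/',
  NA,
  '/beer/profile/863/21690/',
  '/community/'
)

stub <- 'https://www.beeradvocate.com'

# Assumed stricter pattern: numeric brewery ID, then numeric beer ID,
# anchored at both ends of the relative path
beer_pat <- '^/beer/profile/[0-9]+/[0-9]+/$'

# grep() skips NA entries, so no separate NA handling is needed
beer_links <- paste0(stub, grep(beer_pat, hrefs, value = TRUE))
print(beer_links)
```

The looser 'profile/.*/.*/' pattern keeps the same two links for this input; the anchored version is only worth it if the page also carries longer URLs that merely contain a profile path.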
Also, Abraxas is an amazing beer, and a Santana album.