2016-11-22 5 views

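How do I retrieve Yelp information using jSoup?

I'm fairly new to programming and have been teaching myself Java. What I'm currently trying to do is extract the names of all the businesses returned by a particular Yelp search and store the results in an array. Here is what I have so far: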

import java.util.ArrayList; 
import org.jsoup.Jsoup; 
import org.jsoup.nodes.Document; 
import org.jsoup.nodes.Element; 
import org.jsoup.select.Elements; 
import java.io.IOException; 

public class YelpScraper 
{ 
    public static void main(String[] args) throws IOException 
    { 
     String url = "https://www.yelp.com/search?find_desc=&find_loc=new+jersey&ns=1"; 
     Document document = Jsoup.connect(url).get(); 

     Elements elements = document.getElementsByClass("biz-name js-analytics-click"); 

     for (Element element : elements) 
     { 
      System.out.println(elements.toString()); 
     } 
    } 
} 

Now, here is my problem. This prints:

<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen &amp; Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen &amp; Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen &amp; Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen &amp; Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen &amp; Restaurant</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a> 
 
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a>

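As you can see, it prints the HTML of every element with that class, when all I want is the name of each business. Any thoughts on how I could do this differently? Clearly getElementsByClass() is not the method I should be using. Thanks in advance, friends!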

Answers


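You can either traverse the children of each element or use a more specific selector to begin with. Change the selection so that it returns the span containing the title, then use the text() method to get the text inside the span tag.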

Elements elements = document.select(".indexed-biz-name span"); 
for (Element element : elements) 
{ 
    System.out.println(element.text()); 
} 
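
For the goal stated in the question (names stored in an array), a minimal sketch might look like the following. It assumes the jsoup library is on the classpath and parses a small inline HTML string, copied from the output in the question, instead of fetching Yelp. The class name `BizNameExtractor`, the helper `extractNames`, and the selector `a.biz-name span` are illustrative choices, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class BizNameExtractor
{
    // Hypothetical helper: collects the text of every <span> nested
    // inside an <a> element that carries the "biz-name" class.
    static List<String> extractNames(String html)
    {
        Document document = Jsoup.parse(html);
        List<String> names = new ArrayList<>();
        for (Element span : document.select("a.biz-name span"))
        {
            // text() returns the visible text only, without any tags.
            names.add(span.text());
        }
        return names;
    }

    public static void main(String[] args)
    {
        // Two entries copied from the question's output (attributes trimmed).
        String html =
            "<a class=\"biz-name js-analytics-click\" href=\"/biz/sushi-house-21-newark-2\"><span>Sushi House 21</span></a>"
            + "<a class=\"biz-name js-analytics-click\" href=\"/biz/burger-walla-newark\"><span>Burger Walla</span></a>";

        // Convert the list to a plain array, as the question asks for.
        String[] names = extractNames(html).toArray(new String[0]);
        for (String name : names)
        {
            System.out.println(name);
        }
    }
}
```

Note that the loop reads from the loop variable, one element at a time. The code in the question calls `elements.toString()` inside the loop, which prints the whole collection on every pass and is why the same list repeats in its output.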

Hi, thanks. It works like a charm! I only picked up jSoup yesterday. If you don't mind, could you explain the syntax of the select() method? From what I can see, you always start with a ".", followed by the class name and then the tag?
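
On the select() syntax asked about above: jsoup's select() takes CSS-style selectors, so the leading "." only means "match by class name", and it is one of several forms. The sketch below tries a few common ones against a small inline document. The markup is an assumption made for illustration (the question's output shows only the `<a>` elements, not the wrapper that `.indexed-biz-name` would match), and `SelectorSyntaxDemo` and `count` are hypothetical names.

```java
import org.jsoup.Jsoup;

public class SelectorSyntaxDemo
{
    // Assumed markup: each result link wrapped in a span with class "indexed-biz-name".
    static final String HTML =
        "<div id=\"results\">"
        + "<span class=\"indexed-biz-name\">1. <a class=\"biz-name\" href=\"/biz/spanish-tavern-newark\"><span>Spanish Tavern</span></a></span>"
        + "<span class=\"indexed-biz-name\">2. <a class=\"biz-name\" href=\"/biz/burger-walla-newark\"><span>Burger Walla</span></a></span>"
        + "</div>";

    // Hypothetical helper: number of elements matched by a CSS query.
    static int count(String cssQuery)
    {
        return Jsoup.parse(HTML).select(cssQuery).size();
    }

    public static void main(String[] args)
    {
        String[] queries = {
            "span",                   // tag name: every <span>
            ".indexed-biz-name",      // class name: any element with that class
            "a.biz-name",             // tag and class combined
            ".indexed-biz-name span", // descendant: <span> anywhere inside the class match
            "#results > span",        // id, then direct children only
            "a[href]"                 // attribute presence
        };
        for (String query : queries)
        {
            System.out.println(query + " matches " + count(query) + " element(s)");
        }
    }
}
```

A space between two parts means "descendant of", so `.indexed-biz-name span` reads as "any span inside an element with class indexed-biz-name"; the class does not have to come first, and a tag is not required after it.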