Updating spider code managed by scrapyd

What is the proper way to install and activate a new version of a spider that is controlled by scrapyd?

I install a new spider version using scrapyd-deploy while a job is still running. Do I have to cancel the job with cancel.json and then schedule a new job?

2017-01-10 Markus

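Answering my own question:

I wrote a small Python script that stops all running spiders. After running this script, I run scrapyd-deploy and then relaunch my spiders. I'm not sure whether this is how Scrapy experts would do it, but it seems reasonable to me.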

import sys
import time

import requests


PROJECT = 'crawler'  # replace with your project's name

# ask scrapyd for the project's pending, running and finished jobs
resp = requests.get("http://localhost:6800/listjobs.json?project=%s" % PROJECT)
list_json = resp.json()
failed = False

count = len(list_json["running"])
if count == 0:
    print("No running spiders found.")
    sys.exit(0)

for sp in list_json["running"]:
    # cancel this spider
    r = requests.post("http://localhost:6800/cancel.json",
                      data={"project": PROJECT, "job": sp["id"]})
    print("Sent cancel request for %s %s" % (sp["spider"], sp["id"]))
    print("Status: %s" % r.json())
    if r.json()["status"] != "ok":
        print("ERROR: Failed to stop spider %s" % sp["spider"])
        failed = True

# exit with an error if any cancel request was rejected
if failed:
    sys.exit(1)

# poll running spiders and wait until all spiders are down
while count:
    time.sleep(2)
    resp = requests.get("http://localhost:6800/listjobs.json?project=%s" % PROJECT)
    count = len(resp.json()["running"])
    print("%d spiders still running" % count)

This script requires the requests package (pip install requests). Replace the value of PROJECT with your own project's name.
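The last step of the workflow, relaunching the spiders after scrapyd-deploy, is not covered by the script above. A minimal sketch of how it might look, assuming scrapyd's schedule.json endpoint on the same host and port: the helper names post_form and schedule_spiders are hypothetical, the spider names you pass in are your own, and the standard library's urllib is used here instead of requests only so the sketch has no extra dependencies.

```python
import json
import urllib.parse
import urllib.request

SCRAPYD_URL = "http://localhost:6800"  # same host/port as the script above
PROJECT = "crawler"                    # replace with your project's name


def post_form(url, fields):
    """POST form-encoded fields and return the decoded JSON reply."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(url, data=body, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def schedule_spiders(spiders, post=post_form):
    """Schedule each named spider via schedule.json.

    Returns a dict mapping spider name to the job id reported by scrapyd.
    The `post` parameter exists only so the HTTP call can be swapped out.
    """
    jobs = {}
    for name in spiders:
        reply = post(SCRAPYD_URL + "/schedule.json",
                     {"project": PROJECT, "spider": name})
        if reply.get("status") != "ok":
            raise RuntimeError("Failed to schedule %s: %s" % (name, reply))
        jobs[name] = reply.get("jobid")
    return jobs
```

After the stop script has exited with status 0 and scrapyd-deploy has finished, calling schedule_spiders with the list of spider names you want running again starts them on the newly deployed version.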


2017-01-11 15:10:32 Markus
