Traveling Salesman Probl * m에게 동적 프로그래밍 솔루션의 속도를 높이기위한 제안 사항?

과제 중 하나가 TSP (Traveling Salesman Problem)를 해결하기 위해 동적 프로그래밍 알고리즘을 구현하는 온라인 코스를 진행 중입니다. 내 Python 구현은 소규모 (~ 5 개 도시)에서 작동하지만 25 개 도시의 '실제'적용에서는 매우 느립니다. 나는 알고리즘의 속도를 높이기위한 제안을 찾고있다.Traveling Salesman Probl * m에게 동적 프로그래밍 솔루션의 속도를 높이기위한 제안 사항?

알고리즘은 발췌에서 설명

동적 프로그래밍 솔루션은 또한 추가적인 참조가 주어진다 http://www.geeksforgeeks.org/travelling-salesman-problem-set-1/에서 설명된다.

할당의 문제 설명은 다음과 같습니다

내가 배열 A의 팬더 DataFrame 물체를 사용하여 의사를 구현했습니다. 세트는 해시 가능하지 않고 인덱스로 사용할 수 없으므로 대신 튜플을 사용하여 세트를 고유 한 표현으로 만들기 위해 정렬 할 때주의해야합니다. 여기 증가 크기의 몇 가지 테스트 케이스와 함께 코드는 다음과 같습니다 테스트에 사용

import functools 
from itertools import combinations 
import numpy as np 
import pandas as pd 
from cached_property import cached_property 
import pytest 


def powerset_list(s): 
    '''Return a list of tuples representing all subsets of s''' 
    return functools.reduce(lambda x, y: x + y, [list(combinations(s, r)) for r in range(len(s)+1)]) 


class Graph(object): 
    def __init__(self, edges): 
     self.edges = edges 

    @cached_property 
    def nodes(self): 
     _nodes = set() 
     for edge in self.edges: 
      u, v, weight = edge 
      _nodes.add(u) 
      _nodes.add(v) 
     return list(_nodes) 

    @cached_property 
    def W(self): 
     '''Matrix of edge weights''' 
     n = len(self.nodes) 
     w = np.full((n, n), np.inf) 
     np.fill_diagonal(w, 0) 
     w = pd.DataFrame(w, index=range(1, n+1), columns=range(1, n+1)) 
     for edge in self.edges: 
      u, v, weight = edge 
      w.set_value(u, v, weight) 
      w.set_value(v, u, weight) 
     return w 

    def tsp(self): 
     '''Solve the traveling salesman problem using a dynamic programming method''' 
     n = len(self.nodes) 
     sets = [(1,) + x for x in powerset_list(range(2, n+1))] 
     A = pd.DataFrame(np.full((len(sets), n), np.inf), index=sets, columns=range(1, n+1)) 
     A.set_value((1,), 1, 0) 
     for m in range(2, n+1): 
      for S in [(1,) + perm for perm in combinations(range(2, n+1), m-1)]: 
       for j in set(S) - set([1]): 
        S_min_j = tuple(sorted(set(S) - set([j]))) 
        A.set_value(S, j, min(A.get_value(S_min_j, k) + self.W.get_value(k, j) for k in S_min_j)) 
     return min(A.get_value(tuple(range(1, n+1)), j) + self.W.get_value(j, 1) for j in range(2, n+1)) 


@pytest.fixture 
def edges_geeksforgeeks(): 
    '''Edges from the example graph on http://www.geeksforgeeks.org/travelling-salesman-problem-set-1/''' 
    return [(1, 2, 10), (1, 3, 15), (1, 4, 20), (2, 3, 35), (2, 4, 25), (3, 4, 30)] 

def test_tsp(edges_geeksforgeeks): 
    graph = Graph(edges_geeksforgeeks) 
    min_cost = graph.tsp() 
    assert min_cost == 80 


def dist(coord1, coord2): 
    return np.linalg.norm(np.array(coord1) - np.array(coord2)) 

def edges_from_coords(filename): 
    with open(filename) as f: 
     coords = [tuple(map(float, line.split())) for line in f.read().splitlines()[1:]] 
    nodes = list(range(1, len(coords) + 1)) 
    coords = dict(zip(nodes, coords)) 
    return [(comb[0], comb[1], dist(coords[comb[0]], coords[comb[1]])) for comb in combinations(nodes, 2)] 

@pytest.mark.parametrize("test_input, expected", [("Hulburd_1.txt", 10.24), ("Hulburd_2.txt", 12.36), ("Hulburd_3.txt", 14.00)]) 
def test_Hulburd(test_input, expected): 
    '''Test data supplied by Eric Hulburd on the course forum''' 
    edges = edges_from_coords(test_input) 
    graph = Graph(edges) 
    min_cost = graph.tsp() 
    assert np.around(min_cost, decimals=2) == expected 

@pytest.fixture 
def edges_cities(): 
    return edges_from_coords('tsp.txt') 

@pytest.mark.skip(reason="This takes too long to run") 
def test_tsp_cities(edges_cities): 
    graph = Graph(edges_cities) 
    min_cost = graph.tsp() 
    print("The minimum cost rounded down to the nearest integer is {}".format(int(np.floor(min_cost)))) 


if __name__ == "__main__": 
    pytest.main([__file__, "-s"])

파일이 Hulburd_1.txt, Hulburd_2.txt, Hulburd_3.txt, 실제 할당, tsp.txt에 대한 기본 파일입니다. 문제는 tsp.txt과 관련된 마지막 (건너 뛴) 테스트가 너무 오래 실행되는 것입니다.

어떻게 알고리즘 속도를 향상시킬 수 있습니까? 코스 포럼에는 비트 마스크와 병렬 처리를 사용하여 ~ 3 분 내에 실행될 수 있다고 말하는 사람들이 있습니다. 또 다른 제안은 튜플을 사용하는 것보다는 배열의 인덱스를 단순화하는 것이 었습니다. 성능을 개선하는 방법을

출처

2017-09-20 Kurt Peek

더 https://codereview.stackexchange.com/에 적합합니까? – SiHa

기본적으로 코드 검토는 기본적으로 작동하는 프로그램 용이지만 16GB RAM이 장착 된 MacBook Pro는 메모리가 부족하여 마지막 테스트 (25 개 도시)를 실행할 수 없습니다. 이 문제를 다루기 쉽도록 근본적인 개선 (예 : 상향식 접근 대신 메모 사용)을해야한다고 생각합니다. –

몇 가지 아이디어 :

대신 튜플이 당신의 부분 집합을 표현하기 위해 32 비트의 int를 사용의

- 당신은 당신이 이전에 필요한 각 단계에는 32 개 이상의 도시
가없는 경우이 충분해야한다 크기의 부분 집합에 대한 계산 된 값 m - 1은 (당신이 크기 m-2, M-3 등의 하위 집합에 대한 모든 값을 저장할 필요가 없습니다) -이 크게 메모리 사용을 줄일 수 있습니다

출처

2017-09-20 21:55:36 algrid

Traveling Salesman Probl * m에게 동적 프로그래밍 솔루션의 속도를 높이기위한 제안 사항?

답변

관련 문제