Understanding the constraints of Negamax

The code snippet below computes the bestMove for a position in a tic-tac-toe game. I understand almost every part of the code except the condition in the for loop, minRating != LOSING_POSITION. The code is an implementation of the following pseudocode:

moveT FindBestMove(stateT state, int depth, int & rating) {
    for (each possible move or until you find a forced win) {
        Make the move.
        Evaluate the resulting position, adding one to the depth indicator.
        Keep track of the minimum rating so far, along with the corresponding move.
        Retract the move to restore the original state.
    }
    Store the move rating into the reference parameter.
    Return the best move.
}

I could not match the pseudocode's "or until you find a forced win" with the second condition of the for loop in the given code, minRating != LOSING_POSITION:

moveT FindBestMove(stateT state, int depth, int & rating) {
    Vector<moveT> moveList;
    GenerateMoveList(state, moveList);
    int nMoves = moveList.size();
    if (nMoves == 0) Error("No moves available");
    moveT bestMove;
    int minRating = WINNING_POSITION + 1;
    for (int i = 0; i < nMoves && minRating != LOSING_POSITION; i++) {
        moveT move = moveList[i];
        MakeMove(state, move);
        int curRating = EvaluatePosition(state, depth + 1);
        if (curRating < minRating) {
            bestMove = move;
            minRating = curRating;
        }
        RetractMove(state, move);
    }
    rating = -minRating;
    return bestMove;
}

int EvaluatePosition(stateT state, int depth) {
    int rating;
    if (GameIsOver(state) || depth >= MAX_DEPTH) {
        return EvaluateStaticPosition(state);
    }
    FindBestMove(state, depth, rating);
    return rating;
}


2012-11-26 motiur

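The program starts by initializing minRating to WINNING_POSITION + 1 (WINNING_POSITION being a win for the opponent, I guess), then loops over the moves, trying to find the one that does maximum damage, that is, the one that minimizes minRating.

When EvaluatePosition returns LOSING_POSITION, this move leaves the opponent losing whatever they reply, so the search can be terminated and this move is considered the best move. No other move can rate lower than LOSING_POSITION, which is why the loop condition minRating != LOSING_POSITION corresponds to "until you find a forced win".

If there are no obvious LOSING_POSITIONs, the algorithm selects the "best" move according to the static evaluation.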
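The early exit can be tried out on a game much smaller than tic-tac-toe. The following is a minimal sketch under stated assumptions, not the code from the question: it uses a hypothetical stick-taking game (remove 1, 2, or 3 sticks, whoever takes the last stick wins), passes the state by value so no RetractMove is needed, and adds an illustrative stopOnForcedWin flag and positionsSearched counter so the search can be run with and without the cutoff.

```cpp
#include <cassert>

// Hypothetical toy game, not from the question: a pile of sticks, each move
// removes 1, 2, or 3 sticks, and whoever takes the last stick wins.
const int WINNING_POSITION = 1000;
const int LOSING_POSITION = -WINNING_POSITION;
const int MAX_DEPTH = 20;

int positionsSearched = 0;  // illustrative counter: calls to EvaluatePosition

int EvaluatePosition(int sticks, int depth, bool stopOnForcedWin);

// Returns the number of sticks to take. 'rating' receives the rating of the
// position from the point of view of the player to move.
int FindBestMove(int sticks, int depth, int &rating, bool stopOnForcedWin) {
    assert(sticks > 0);
    int bestMove = 0;
    int minRating = WINNING_POSITION + 1;
    for (int take = 1; take <= 3 && take <= sticks; take++) {
        // Same idea as the question's second loop condition: once some move
        // leaves the opponent in LOSING_POSITION, no later move can rate lower.
        if (stopOnForcedWin && minRating == LOSING_POSITION) break;
        int curRating = EvaluatePosition(sticks - take, depth + 1, stopOnForcedWin);
        if (curRating < minRating) {
            bestMove = take;
            minRating = curRating;
        }
    }
    rating = -minRating;  // the opponent's worst outcome is our best outcome
    return bestMove;
}

// Rates a position from the point of view of the player about to move.
int EvaluatePosition(int sticks, int depth, bool stopOnForcedWin) {
    positionsSearched++;
    if (sticks == 0) return LOSING_POSITION;  // the other player took the last stick
    if (depth >= MAX_DEPTH) return 0;         // search limit reached: treat as neutral
    int rating;
    FindBestMove(sticks, depth, rating, stopOnForcedWin);
    return rating;
}
```

Calling FindBestMove(10, 0, rating, true) and FindBestMove(10, 0, rating, false) should pick the same move with the same rating; the only difference is that the first call visits fewer positions, because it stops looking as soon as one reply leaves the opponent lost.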


2012-11-26 06:44:24 lenik

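Just to reiterate and clarify something: whom does the minimum value obtained from the snippet int curRating = EvaluatePosition(state, depth + 1); benefit, the current player or the opponent? I think it is taken from the current player's point of view, since otherwise you would not want the opponent to lose. And the loop only stops when the evaluation function gives a LOSING_POSITION value. Is that right? If you have anything to add, it would be helpful. – motiur

The loop stops when there are no more possible moves, or when it is obvious that the opponent cannot win whatever move he chooses. For 'curRating', lower is better; the same goes for 'minRating'. However, just before 'EvaluatePosition()' returns, the "rating" changes its sign (rating = -minRating in FindBestMove), because the opponent has to move his rating in the opposite direction, hence "NegaMax". – lenik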
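The sign change can be checked on a tiny hand-built tree. This is a rough sketch, assuming a hypothetical Node type and ExampleTree helper that are not part of the question. MinThenNegate follows the shape of the question's code (minimize the opponent's rating, then negate), while Negamax is the more common formulation (negate each child's rating, then maximize). The two agree because max(-a, -b) equals -min(a, b).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical game tree for illustration. A leaf holds a score from the
// point of view of the player who would move at that leaf.
struct Node {
    int leafScore;
    std::vector<Node> children;
};

// Shape used in the question: find the lowest rating the opponent can be
// left with, then flip its sign to get the rating for the player to move.
int MinThenNegate(const Node &node) {
    if (node.children.empty()) return node.leafScore;
    int minRating = 1000000;
    for (const Node &child : node.children)
        minRating = std::min(minRating, MinThenNegate(child));
    return -minRating;
}

// Common negamax shape: flip the sign of each child's rating first, then
// take the maximum. One rule serves both players.
int Negamax(const Node &node) {
    if (node.children.empty()) return node.leafScore;
    int best = -1000000;
    for (const Node &child : node.children)
        best = std::max(best, -Negamax(child));
    return best;
}

// Two moves at the root, each answered by two opponent replies.
Node ExampleTree() {
    Node a{0, {Node{3, {}}, Node{-2, {}}}};
    Node b{0, {Node{5, {}}, Node{1, {}}}};
    return Node{0, {a, b}};
}
```

On this tree the opponent can hold the root player to -2 after the first move and to 1 after the second, so both functions rate the root at 1.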
