Doudizhu should be familiar to everyone. The following article mainly shares with you the relevant information about using Python to solve the endgame of Doudizhu. The article introduces it in very detail and is of certain significance to everyone. The reference learning value, friends who need it can take a look below.
Preface
I believe everyone has played Doudizhu, so the rules will not be introduced again.
Go directly to the endgame picture seen in the circle of friends:
I tried this question when I first saw it I try to crack it manually, and every time I think I have found the farmer’s winning strategy, I end up discovering that the farmer can’t escape. Since manual cracking cannot exhaust all possibilities, we can only use codes to help us solve this problem.
This article will briefly describe how to solve such problems through code. At the end, the final result of the endgame will be announced and the code will be open sourced for everyone to complain about.
minimax
The core idea of the code is minimax. Minimax can be broken down into two parts, mini and max, which mean minimum and maximum respectively.
What is the intuitive understanding? It's a bit like two people A and B playing chess. A can now move chess at N points. Assume that A moves chess at a certain point, so that the board evaluation score of A's move is the highest; but when it is B's turn to play, it will definitely move in the direction that is most unfavorable to A. Go, so that A's next step must follow the trajectory set by B, and cannot reach the highest board score that A estimated at this step in the first step.
It is the same in the game of cards. If the farmer's hand of cards cannot be won by the landlord no matter how he deals with it, then it can be said that the farmer has a winning strategy; otherwise, the farmer will lose.
Core logic
We can use a functionhand_out to simulate a person's card playing process. In real life, if a person wants to play cards, he must know all the cards in his hand: me_pokers, and he also needs to know the cards played in the previous hand: last_hand. If we want to use this function to simulate the playing of two people, we also need to know all the opponent's current cards: enemy_pokers.
The return value of this function is whether I can win when it is my_pokers' turn to play cards. Returns true if it can win, false otherwise.
def hand_out(me_pokers, enemy_pokers, last_hand)
Assuming it is my turn to play, if all the cards in my hand are played, then I will immediately know that I have won; otherwise, if all the opponent's cards are played It's over, and I'm not, then I've failed.
if not me_pokers: return True if not enemy_pokers: return False
Because it's my turn to play cards now, I first need to know all the hand combinations I can play now. Note: This combination includes the strategy of checking (i.e. not playing).
all_hands = get_all_hands(me_pokers)
Now we want to iterate through all possible hand combinations.
First of all, I need to know what card the opponent played in the last hand.
If the opponent chose to check in the previous hand, or did not have the previous hand, then I must not check in this round, but I can play any card
If my opponent played a card in the previous hand, I must play a bigger card than it or choose to check directly (not play a card) in this round
Here comes the key point. After playing my cards or choosing to check, we need to use a recursive call to simulate the opponent's next behavior. If the opponent's next card play cannot win, then my card play this time will definitely win; otherwise, for every card play choice I make, if the opponent can win, then I will lose.
All codes are as follows:
def hand_out(me_pokers, enemy_pokers, last_hand, cache): if not me_pokers: # 我全部过牌,直接获胜 return True if not enemy_pokers: # 对手全部过牌,我失败 return False # 获取我当前可以出的所有手牌组合,包括过牌 all_hands = get_all_hands(me_pokers) # 遍历我的所有出牌组合,进行模拟出牌 for hand in all_hands: # 如果上一轮对手出了牌,则这一轮我必须要出比对手更大的牌 或者 对手上一轮选择过牌,那么我只需出任意牌,但是不能过牌 if (last_hand and can_comb2_beat_comb1(last_hand, hand)) or (not last_hand and hand['type'] != COMB_TYPE.PASS): # 模拟对手出牌,如果对手不能取胜,则我必胜 if not hand_out(enemy_pokers, make_hand(me_pokers, hand), hand, cache): return True # 如果上一轮对手出了牌,但我这一轮选择过牌 elif last_hand and hand['type'] == COMB_TYPE.PASS: # 模拟对手出牌,如果对手不能取胜,则我必胜 if not hand_out(enemy_pokers, me_pokers, None, cache): return True # 如果之前的所有出牌组合均不能必胜,则我必败 return False
Build
The above core logic is clear After that, building a cracker will become very simple.
First of all, we need to use numbers to represent the size of the cards. Here we use 3 to represent 3, 11 to represent J, 12 to represent Q, and so on...
Secondly, we need to find out All combinations of cards in a hand require the get_all_hands
function. The specific implementation is complicated but very simple, so I won’t go into details here.
Then, we also need a card strength judgment function can_comb2_beat_comb1(comb1, comb2)
. This function is used to compare the card strength of the two sets of hands to see if comb2 can defeat comb1. The only thing that needs to be noted is that in the rules of Landlord, except for bombs, all other cards are of equal strength. Only cards of the same type can be compared.
Finally, we need a simulated card playing function make_hand(pokers, hand)
, which is used to find out the remaining hands after playing a hand with pokers in hand. Cards are also very simple to implement, just simply remove those played cards.
Efficiency
Due to the huge number of possible hands in a deck of cards, the number of recursive branches is huge. Therefore, the time overhead is very large, which is factorial level O(N!). According to Stirling's formula, it is approximately O(N^N).
Because there may be many repeated cards, resulting in many repeated recursive calls. So adding a cache can greatly improve efficiency.
That is, the description of our hand, the enemy's hand and the hand in the previous round (str(me_pokers)+str(enemy_pokers)+str(last_hand))
is the key, Store the calculated results in the cache dictionary. The next time you encounter the same situation, you can directly retrieve it from the cache dictionary without repeating the calculation again. The time complexity is optimized to exponential O(C^N).
Result
The result of the code calculation is that farmers have no winning strategy. In other words, as long as the landlords knew how to play, the farmers could not win. Is the class solidification already like this...
Open source
The code is placed on Github: doudizhu_solver, or you can download it locally, under the MIT license, and play as you like.
The above is the detailed content of Teach you a trick to use Python to solve the endgame of Landlords. For more information, please follow other related articles on the PHP Chinese website!