Deep Learning On a quest to solve superintelligence

DeepRTS and ML-Algorithms

In the beginning of 2017, i started researching how to apply tree-search algorithms to real-time-strategy games. microRTS is implementation of a RTS game in its simplest form and allows for research in various areas of machine learning.

I developed numerous Monte-Carlo based tree searches and it gave good results in microRTS. I figured that microRTS was to simple, and i started work on a new engine based on the principles of Warcraft 2. This implementation had a variable complexity level based on which features i enabled in the configuration file.

First version of this game was developed in Python.

When version 1.0 was complete i stumbled upon issues with perfromance. Python was simply not fast enough to support Tree-Search algorithms and yielded only 40~ nodes visited per game frame. For algorithms utilizing the GPU this was not an issue, but CPU bound tasks were a big problem. Starting to optimize the game engine, i utilized Cython which compiles Python to C/C++. Using Cython yielded very good results but at the cost of reduced debugging capabilities. This made further development hard and the engine rewritten in pure C++

The new C++ implementation were much faster and it also embeds Python so that libraries like Tensorflow, Keras and Theano can be utilized for machine learning. Furthermore, it increases the tree-search performance to 10000~ node visits per game frame which is a huge boost from the python implementation.

The game currently has 4 algorithms implemented: Deep Q-Network, Monte-Carle-Tree-Search, Monte-Carlo-Action-Search, Monte-Carle-Search-Direct. Each of the MC algorithms are just different ways of interpreting the score for each node, thus giving very different results.

As we can see from the graph. The plain MCTS outperforms my attempts to make “shortcuts” in the algorithm. MCTSDirect simply skips some intermediate nodes which it attempts to classify as “useless” while MCAS attempts to build a Q-Table of which action should be done to which time.

DQN is a DeepMind’s implementation of a AI for Atari Games. The reason this does not perform well is becuase of the data-representation and depth/size of the network.

 

 

 

%d bloggers like this: