Tetris Policy Improvement

In this work, the task is to learn a Tetris controller that performs as well as possible. Our baseline applies the noisy cross-entropy method to policy selection, stochastically generating samples based on previously
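
As a rough illustration of the noisy cross-entropy method named above, the following is a minimal sketch and not the authors' implementation. It assumes a policy is a vector of real-valued weights sampled from an independent Gaussian per weight. It also assumes that the distribution is refit to the best-scoring samples of each iteration, with an extra noise term added to the variance to delay premature convergence. The function name `noisy_cross_entropy`, all parameter values, the decreasing noise schedule, and the toy quadratic score (standing in for a Tetris game score) are hypothetical choices made only for this example.

```python
import random
import statistics


def noisy_cross_entropy(score, dim, iterations=60, population=100,
                        elite_frac=0.1, noise=4.0, seed=0):
    """Maximize score(weights) with a noisy cross-entropy search (sketch)."""
    rng = random.Random(seed)
    mean = [0.0] * dim    # assumed initial mean of each weight
    var = [100.0] * dim   # assumed initial variance of each weight
    n_elite = max(2, int(population * elite_frac))

    for t in range(iterations):
        # Sample candidate weight vectors from the current Gaussian.
        samples = [
            [rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
            for _ in range(population)
        ]
        # Keep the best-scoring candidates (the elite set).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]

        # Assumed schedule: the extra variance shrinks linearly to zero.
        extra = max(noise - t / 10.0, 0.0)

        # Refit the Gaussian to the elite set, then add the noise term.
        for i in range(dim):
            column = [w[i] for w in elite]
            mean[i] = statistics.fmean(column)
            var[i] = statistics.pvariance(column) + extra

    return mean


# Toy stand-in for a game score: higher is better, peak at `target`.
target = [1.0, -2.0, 0.5]


def toy_score(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best = noisy_cross_entropy(toy_score, dim=len(target))
print(best)
```

In a Tetris setting, `toy_score` would be replaced by a function that plays one or more games with the given weights and returns the result, which makes each evaluation both expensive and noisy.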
