
Demonstrate the basic usage of our hironaka repo.

License: MIT
Kernel: Python 3 (system-wide)

Overview

This is a brief notebook explaining what the package hironaka can do with examples.

Note: We require Python version >= 3.8

Summary

  • Local resolution of singularities can be transformed into a two-player game (Hironaka's polyhedral game). Variations of the game have applications in various other problems, such as singularity theory.

  • Our purpose is to facilitate computations around Hironaka's game, including support for machine learning techniques like reinforcement learning.

Quick-start

import os, sys
sys.path.insert(0, os.getcwd()+'/hironaka')
from hironaka.Points import Points

The definition of the game

The original Hironaka game is a game between two players. They operate in a non-symmetric fashion. To emphasize the difference, let us call player 1 the "host" and player 2 the "agent". Every turn the game has a state; the host makes a move, then the agent makes a move. Their moves change the state, and the game goes into the next turn.

A state is represented by a set of points $S \subset \mathbb{Z}^n$ that are the vertices of the Newton polytope of $S$ itself.

Every turn,

  • The host chooses a subset $I \subset \{1, 2, \cdots, n\}$ such that $|I| \geq 2$.

  • The agent chooses a number $i \in I$.

Together, $i$ and $I$ change the state $S$ into the next state according to the following linear change of variables: $$x_j \mapsto \begin{cases} x_j, &\text{if } i \neq j \\ \sum\limits_{k\in I} x_k, &\text{if } i = j, \end{cases}$$ for points $(x_1, \cdots, x_n) \in \mathbb{Z}^n$. We then take the Newton polytope of the transformed points and keep only its vertices.

A state is terminal if it consists of a single point. In that case the game does not continue, and the host wins. As a result, the host wants to reduce the number of points in $S$ as quickly as possible, while the agent wants to keep it large for as long as possible.
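To make the rules concrete, here is a plain-Python sketch of a single move. It is independent of the package: apply_move is our own illustrative helper, and the vertex test is simplified to dropping coordinatewise-dominated points, which is enough for this small example (the true test would compute the vertices of the Newton polytope).

def apply_move(points, I, i):
    """Apply x_i -> sum_{k in I} x_k to every point, then drop non-vertices."""
    transformed = [
        tuple(sum(x[k] for k in I) if j == i else x[j] for j in range(len(x)))
        for x in points
    ]
    # Simplified vertex test: discard points coordinatewise dominated by another point.
    return [p for p in transformed
            if not any(q != p and all(q[j] <= p[j] for j in range(len(p)))
                       for q in transformed)]

# Host chooses I = {0, 1}, agent chooses i = 1:
# (1, 0) -> (1, 1) and (0, 1) -> (0, 1); since (1, 1) is dominated by (0, 1),
# a single point remains and the host wins.
print(apply_move([(1, 0), (0, 1)], [0, 1], 1))  # [(0, 1)]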

The code

Points is a wrapper that aims to showcase the basic usage of core classes. Its objects record a set of points in space.

  • The points are represented by nested lists. All points sit in the same Euclidean space.

  • One can pass the distinguished_point parameter to specify the index of a distinguished point, which will then be tracked through all operations (see the sketch after this list).

  • Once the Points object is created, one can call .step() to take actions. Its two parameters are host_coordinates and agent_coordinates, as demonstrated in the lines ahead.
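The distinguished_point option above is not used elsewhere in this notebook. Here is a hedged sketch of how it might be passed, assuming the parameter accepts the integer index directly; how the tracked index is read back afterwards is not shown here.

# Hypothetical use of distinguished_point: track the point at index 0
# through all subsequent operations (exact signature not verified).
p = Points([[1,0],[0,1]], distinguished_point=0)
p.step([0,1], 1)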

# Create a set of points (1,0), (0,1)
p = Points([[1,0],[0,1]])
# Move the state to the next one for host choice [0,1] and agent choice 1.
# It returns True because the game successfully transformed to the next state.
p.step([0,1],1)
True
# Check whether the state is terminal
p.ended
True
# When the state is already terminal, the operation will not continue
p.step([0,1],1)
False

Different Hosts and Agents

There are a few implemented hosts and agents. The usage is demonstrated below.

from hironaka.host import RandomHost, Zeillinger
from hironaka.agent import RandomAgent, ChooseFirstAgent
from hironaka.game import GameHironaka
from hironaka.src import generate_points

random_agent = RandomAgent(ignore_batch_dimension=True)
random_host = RandomHost(ignore_batch_dimension=True)
zeillinger = Zeillinger(ignore_batch_dimension=True)
choose_first_agent = ChooseFirstAgent(ignore_batch_dimension=True)  # this guy always chooses the first coordinate

Note: for normal usage, please make sure to set ignore_batch_dimension=True. If the batch dimension is used, the game deals with multiple separate games as a batch, and Agent and Host take an extra dimension in their inputs.

Remember that Points is merely a wrapper around ListPoints that simplifies some APIs. As the usage gets more involved, it is recommended to switch to ListPoints, where an extra batch dimension must be added on the outside. For now, Points does the job.
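For reference, a hedged sketch of what the batched setup might look like. The extra outer bracket below is the batch dimension (here, a batch of one game), and we assume that hosts constructed without ignore_batch_dimension=True consume the batched object directly, with signatures mirroring the examples below; neither detail is verified in this notebook.

from hironaka.core import ListPoints

# One game in a batch of size one: note the extra outer bracket.
lp = ListPoints([[[1,0],[0,1]]])
# A host without ignore_batch_dimension=True returns one choice per game in the batch
# (assumed behavior, per the note above).
batched_host = RandomHost()
batched_action = batched_host.select_coord(lp)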

For a host to make a move, feed it the Points or ListPoints object. For an agent to make a move, feed it the Points or ListPoints object plus the host's choice.

When Agent.move(...) is called, the state of the Points or ListPoints object is modified in place unless inplace=False is specified, in which case only the action is returned (a sketch of this variant follows the example below).

p = Points([[0,1],[1,0]])
host_action = random_host.select_coord(p)
print(host_action)
agent_action = random_agent.move(p, host_action)
print(agent_action)
print(p)
[0, 1]
0
[[1, 0]]
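For completeness, the non-mutating variant mentioned above would look like the following sketch, assuming inplace is accepted by move() as a keyword argument (the parameter is described above, but this exact call is not demonstrated in the notebook):

# Leave the points untouched and only return the chosen action.
p2 = Points([[0,1],[1,0]])
agent_action = random_agent.move(p2, [0,1], inplace=False)
print(agent_action)  # the agent's choice
print(p2)            # p2 is unchanged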

You can pitch them against each other as follows.

import logging

for host in [random_host, zeillinger]:
    for agent in [random_agent, choose_first_agent]:
        points = Points(generate_points(5))
        game = GameHironaka(points, host, agent, scale_observation=False)
        game.logger.setLevel(logging.INFO)
        if not game.logger.hasHandlers():
            game.logger.addHandler(logging.StreamHandler(stream=sys.stdout))
        print(f"{host.__class__.__name__} is playing against {agent.__class__.__name__}")
        while game.step(verbose=1):
            print(game.state)
        game.print_history()
RandomHost is playing against RandomAgent
[[[38, 17, 1], [30, 16, 23], [15, 21, 39], [13, 46, 16]]]
Host move: [0, 1]
Agent move: 0
Game Ended: False
[[[55, 17, 1], [46, 16, 23], [36, 21, 39]]]
[[[55, 17, 1], [46, 16, 23], [36, 21, 39]]]
Host move: [2, 0]
Agent move: 0
Game Ended: True
Coordinate history (host choices): [[0, 1], [2, 0]]
Move history (agent choices): [0, 0]
RandomHost is playing against ChooseFirstAgent
[[[17, 11, 3], [8, 12, 14]]]
Host move: [1, 2]
Agent move: 1
Game Ended: False
[[[17, 14, 3], [8, 26, 14]]]
[[[17, 14, 3], [8, 26, 14]]]
Host move: [2, 0]
Agent move: 0
Game Ended: True
Coordinate history (host choices): [[1, 2], [2, 0]]
Move history (agent choices): [1, 0]
Zeillinger is playing against RandomAgent
[[[26, 11, 6], [2, 21, 25]]]
Host move: [2, 0]
Agent move: 2
Game Ended: False
[[[26, 11, 32], [2, 21, 27]]]
[[[26, 11, 32], [2, 21, 27]]]
Host move: [1, 0]
Agent move: 0
Game Ended: False
[[[37, 11, 32], [23, 21, 27]]]
[[[37, 11, 32], [23, 21, 27]]]
Host move: [1, 0]
Agent move: 0
Game Ended: False
[[[48, 11, 32], [44, 21, 27]]]
[[[48, 11, 32], [44, 21, 27]]]
Host move: [1, 2]
Agent move: 2
Game Ended: True
Coordinate history (host choices): [[2, 0], [1, 0], [1, 0], [1, 2]]
Move history (agent choices): [2, 0, 0, 2]
Zeillinger is playing against ChooseFirstAgent
[[[39, 15, 38], [32, 7, 48], [25, 32, 29], [14, 10, 39]]]
Host move: [2, 1]
Agent move: 1
Game Ended: False
[[[39, 53, 38], [25, 61, 29], [14, 49, 39]]]
[[[39, 53, 38], [25, 61, 29], [14, 49, 39]]]
Host move: [1, 0]
Agent move: 0
Game Ended: False
[[[92, 53, 38], [86, 61, 29], [63, 49, 39]]]
[[[92, 53, 38], [86, 61, 29], [63, 49, 39]]]
Host move: [1, 2]
Agent move: 1
Game Ended: False
[[[86, 90, 29], [63, 88, 39]]]
[[[86, 90, 29], [63, 88, 39]]]
Host move: [2, 0]
Agent move: 0
Game Ended: False
[[[115, 90, 29], [102, 88, 39]]]
[[[115, 90, 29], [102, 88, 39]]]
Host move: [2, 0]
Agent move: 0
Game Ended: False
[[[144, 90, 29], [141, 88, 39]]]
[[[144, 90, 29], [141, 88, 39]]]
Host move: [2, 0]
Agent move: 0
Game Ended: False
[[[180, 88, 39], [173, 90, 29]]]
[[[180, 88, 39], [173, 90, 29]]]
Host move: [1, 2]
Agent move: 1
Game Ended: True
Coordinate history (host choices): [[2, 1], [1, 0], [1, 2], [2, 0], [2, 0], [2, 0], [1, 2]]
Move history (agent choices): [1, 0, 1, 0, 0, 0, 1]

Working with neural networks

Everybody wants to do machine learning nowadays, including us. In our project, we work with PyTorch.

Hosts and agents that make decisions based on neural networks (nn.Module) should use the classes PolicyHost and PolicyAgent or their subclasses. They take a Policy object as a parameter at construction time.

from torch import nn
from hironaka.policy.NNPolicy import NNPolicy
from hironaka.agent import PolicyAgent
from hironaka.host import PolicyHost

# Construct a basic neural network
class NN(nn.Module):
    def __init__(self, input_size: int, dimension: int):
        super(NN, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(input_size, 32),
            nn.ReLU(),
            nn.Linear(32, 32),
            nn.ReLU(),
            nn.Linear(32, dimension),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
# Initialize the networks and construct the host and agent based on those models.
# Host input: up to 11 points in dimension 4, flattened (4 * 11 entries).
# Agent input: the same observation plus (presumably) the host's chosen
# coordinates, encoded in 4 extra entries (hence 4 * 11 + 4).
nn_h = NN(4 * 11, 4)
nn_a = NN(4 * 11 + 4, 4)
pl_h = NNPolicy(nn_h, mode='host', eval_mode=True, max_num_points=11, dimension=4)
pl_a = NNPolicy(nn_a, mode='agent', eval_mode=True, max_num_points=11, dimension=4)
agent = PolicyAgent(pl_a, ignore_batch_dimension=True)
host = PolicyHost(pl_h, ignore_batch_dimension=True)

One can apply all kinds of creativity to the training of these models. But let us leave the training part to you smart folks and demonstrate how to use the neural-network players we just defined. They have exactly the same API as before, since they are just subclasses of Host and Agent.

from hironaka.core import ListPoints
from hironaka.src import generate_batch_points
import logging

game = GameHironaka(Points(generate_points(11, dimension=4)), host, agent, scale_observation=False)
game.logger.setLevel(logging.INFO)
if not game.logger.hasHandlers():
    game.logger.addHandler(logging.StreamHandler(stream=sys.stdout))

for _ in range(5):
    game.step(verbose=1)
game.print_history()
[[[40, 11, 38, 22], [28, 32, 1, 2], [22, 8, 46, 13], [20, 8, 14, 28], [7, 8, 24, 30], [5, 32, 22, 33]]]
Host move: [0, 2]
Agent move: 0
Game Ended: False
[[[78, 11, 38, 22], [68, 8, 46, 13], [34, 8, 14, 28], [31, 8, 24, 30], [29, 32, 1, 2], [27, 32, 22, 33]]]
Host move: [0, 3]
Agent move: 0
Game Ended: False
[[[100, 11, 38, 22], [81, 8, 46, 13], [62, 8, 14, 28], [61, 8, 24, 30], [31, 32, 1, 2]]]
Host move: [0, 3]
Agent move: 0
Game Ended: False
[[[122, 11, 38, 22], [94, 8, 46, 13], [90, 8, 14, 28], [33, 32, 1, 2]]]
Host move: [0, 2, 3]
Agent move: 3
Game Ended: False
[[[90, 8, 14, 132], [33, 32, 1, 36]]]
Host move: [0, 1, 3]
Agent move: 3
Game Ended: False
Coordinate history (host choices): [[0, 2], [0, 3], [0, 3], [0, 2, 3], [0, 1, 3]]
Move history (agent choices): [0, 0, 0, 3, 3]

Going further

What we present here (Points, Agent, Host, Game, etc.) is merely a facade that demonstrates the mathematical part of the project; it only supports very basic usage. Admittedly, the structure of the repo has A LOT of room to improve. From here, the path branches out into the wilderness of the engineering world. We would like to point users to the corresponding modules:

  • For gym environment support, please take a look at gym_env.

  • TensorPoints stores points as torch.Tensor. Operations are fully vectorized and use only matrix manipulations provided by PyTorch. It does NOT support the single-step game wrappers (Agent, Host, Game), as it can simulate a collection of games at the same time (useful in GPU training).

  • util.FusedGame bypasses gym, fuses the host and agent decisions together, and talks directly to the GPU. It works with TensorPoints.

  • Custom MCTS and DQN modules are in progress.