Agent and AgentGroup#

tomsup.agent#


class tomsup.agent.Agent(strategy: Optional[str] = None, save_history: bool = False, **kwargs)[source]#

Bases: object

Agent is super (parent) class for creating agents in tomsup.

__init__(strategy: Optional[str] = None, save_history: bool = False, **kwargs)[source]#
Parameters:
  • strategy (Optional[str], optional) – The strategy of the agent you wish to create. Defaults to None. It is recommended to create agents using create_agents() rather than instantiating the Agent class directly.

  • save_history (bool, optional) – Should the history of the agent be saved. Defaults to False.

  • kwargs (dict) – Additional keyword arguments passed to the agent (e.g. agent-specific starting parameters).
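
For most use cases it is easier to create agents via create_agents(). The snippet below is a minimal sketch of that recommended route; the exact signature of create_agents is not shown on this page, so the start_params keyword used here is an assumption.

>>> import tomsup as ts
>>> # sketch: create a single RB agent via the recommended helper
>>> rb = ts.create_agents("RB", start_params={"bias": 0.7})  # start_params usage is assumed
>>> rb.get_strategy()  # 'RB'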

get_choice() int[source]#
Returns:

The agent's choice in the previous round

Return type:

int
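
A small illustration using the random bias (RB) agent documented further down; with a bias of 1 the agent always chooses 1, so get_choice() returns 1 after the first round.

>>> import tomsup as ts
>>> rb = ts.agent.RB(bias=1)
>>> rb.compete()
1
>>> rb.get_choice()
1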

get_history(key: Optional[str] = None, format: str = 'df') Union[dict, DataFrame, list][source]#

Return the agent's history. This includes only the information relevant to the agent; e.g. for a random bias (RB) agent only its own choice is saved, while the opponent's choice is not, as it is not used by the agent.

Parameters:
  • key (Optional[str], optional) – The item of interest in the history. Defaults to None, which returns the entire history.

  • format (str, optional) – Format of the return, options include “list”, “dict” and “df”, which is a pandas dataframe. Defaults to “df”.

Returns:

The history in the specified format

Return type:

Union[dict, pd.DataFrame, list]
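
A small illustration mirroring the RB example further down this page; the history is only recorded when the agent is created with save_history=True.

>>> import tomsup as ts
>>> rb = ts.agent.RB(bias=1, save_history=True)
>>> rb.compete()
1
>>> rb.compete()
1
>>> rb.get_history(key="choice", format="list")
[1, 1]
>>> df = rb.get_history(format="df")  # the full history as a pandas dataframe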

get_start_params() dict[source]#
Returns:

The starting parameters of the agent.

Return type:

dict

get_strategy() str[source]#
Returns:

The strategy of the agent

Return type:

str

plot_choice(show: bool = True) None[source]#

Plot the choices of the agent.

Parameters:

show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_internal(fun: Callable, show: bool = True) None[source]#

Function for plotting the internal states of the agent

Parameters:
  • fun (Callable) – A function used to extract the state of interest from the internal states dict.

  • show (bool) – Should plt.show be run at the end of plotting? Defaults to True.

Examples

>>> # plotting the est. probability of opp. choosing one over trials
>>> tom1.plot_internal(fun=lambda internal_states: internal_states["own_states"]["p_op"])
>>> # plotting the agent's belief about its opponent's theory of mind level (p_k)
>>> # probability of sophistication level k=0
>>> tom2.plot_internal(fun=lambda internal_states: internal_states["own_states"]["p_k"][0])
>>> # probability of sophistication level k=1
>>> tom2.plot_internal(fun=lambda internal_states: internal_states["own_states"]["p_k"][1])
reset(save_history: Optional[bool] = None)[source]#

resets the agent to its starting parameters

Parameters:

save_history (Optional[bool], optional) – Should the agent history be saved? Defaults to None, which keeps the previous state.
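
A short sketch of resetting an agent between simulations; it is assumed here that reset() clears the recorded history while keeping the starting parameters.

>>> import tomsup as ts
>>> rb = ts.agent.RB(bias=1, save_history=True)
>>> rb.compete()
1
>>> rb.reset()  # back to the starting parameters; the recorded history is cleared (assumed)
>>> rb.get_start_params()
{'bias': 1, 'save_history': True}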

class tomsup.agent.AgentGroup(agents: List[str], start_params: Optional[List[dict]] = None)[source]#

Bases: object

An agent group is a group of agents. It is a utility class to allow for easily setting up tournaments.

Examples

>>> round_table = AgentGroup(agents=['RB']*2, start_params=[{'bias': 1}]*2)
>>> round_table.agent_names
['RB_0', 'RB_1']
>>> RB_0 = round_table.get_agent('RB_0') # extract an agent
>>> RB_0.bias == 1 # should naturally be 1, as we specified it
True
>>> round_table.set_env('round_robin')
>>> result = round_table.compete(p_matrix="penny_competitive", n_rounds=100, n_sim=10)
Currently the pair, ('RB_0', 'RB_1'), is competing for 10 simulations, each containing 100 rounds.
    Running simulation 1 out of 10
    Running simulation 2 out of 10
    Running simulation 3 out of 10
    Running simulation 4 out of 10
    Running simulation 5 out of 10
    Running simulation 6 out of 10
    Running simulation 7 out of 10
    Running simulation 8 out of 10
    Running simulation 9 out of 10
    Running simulation 10 out of 10
Simulation complete
>>> result.shape[0] == 10*100 # As there are 10 simulations, each containing 100 rounds
True
>>> result['payoff_agent0'].mean() == 1  # Given that both agents always choose 1, agent0 always wins when playing the competitive penny game
True
__init__(agents: List[str], start_params: Optional[List[dict]] = None)[source]#
Parameters:
  • agents (List[str]) – A list of agents

  • start_params (Optional[List[dict]], optional) – The starting parameters of the agents, specified as one dictionary per agent. Defaults to None, indicating default parameters for all agents. Use an empty dictionary to use the defaults of a specific agent.

compete(p_matrix: PayoffMatrix, n_rounds: int = 10, n_sim: int = 1, reset_agent: bool = True, env: Optional[str] = None, save_history: bool = False, verbose: bool = True, n_jobs: Optional[int] = None) DataFrame[source]#

For each pair of agents in the group, competes using the specified parameters.

Parameters:
  • p_matrix (PayoffMatrix) – The payoffmatrix in which the agents compete

  • n_rounds (int, optional) – Number of rounds the agent should play in each simulation. Defaults to 10.

  • n_sim (int, optional) – The number of simulations. Defaults to 1.

  • reset_agent (bool, optional) – Should the agents be reset? Defaults to True.

  • env (Optional[str], optional) – The environment in which the agents should compete. Defaults to None, indicating the already set environment.

  • save_history (bool, optional) – Should the history of the agents be saved. Defaults to False, as this is memory intensive.

  • verbose (bool, optional) – Toggles the verbosity of the function. Defaults to True.

  • n_jobs (Optional[int], optional) – Number of parallel jobs. Defaults to None, indicating no parallelization. -1 indicates as many jobs as there are cores on your machine.

Returns:

A pandas dataframe of the results.

Return type:

pd.DataFrame
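
A minimal sketch of calling compete directly with an explicit environment and parallelization; the strategy names, starting parameters, and n_jobs value are only illustrative.

>>> from tomsup.agent import AgentGroup
>>> group = AgentGroup(agents=["RB", "WSLS"], start_params=[{"bias": 0.7}, {}])
>>> results = group.compete(p_matrix="penny_competitive", n_rounds=30, n_sim=4,
...                         env="round_robin", n_jobs=-1, verbose=False)
>>> results["payoff_agent1"].mean()  # average payoff of agent 1 in each pair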

get_agent(agent: str) Agent[source]#

Returns:

The agent in the group with the specified name.

Return type:

Agent

get_environment()[source]#
Returns:

the pairings resulting from the set environment

get_environment_name() str[source]#
Returns:

The name of the set environment

Return type:

str

get_names() List[str][source]#
Returns:

the names of the agents

Return type:

List[str]

get_results() DataFrame[source]#
Returns:

The results

Return type:

pd.DataFrame

plot_choice(agent0: str, agent1: str, agent: int = 0, sim: Optional[int] = None, plot_individual_sim: bool = False, show: bool = True)[source]#

plots the choice of an agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • plot_individual_sim (bool, optional) – Should you plot each individual simulation. Defaults to False.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_heatmap(aggregate_col: str = 'payoff_agent', aggregate_fun: ~typing.Callable = <function mean>, certainty_fun: ~typing.Union[~typing.Callable, str] = 'mean_ci_95', cmap: str = 'Blues', na_color: str = 'xkcd:white', xlab: str = 'Agent', ylab: str = 'Opponent', cbarlabel: str = 'Average score of the agent', show: bool = True)[source]#

plot a heatmap of the results.

Parameters:
  • aggregate_col (str, optional) – The column to aggregate on. Defaults to “payoff_agent”.

  • aggregate_fun (Callable, optional) – The function to aggregate by. Defaults to np.mean.

  • certainty_fun (Union[Callable, str], optional) – The certainty function, specified either as a string of the form “mean_ci_X”, where X denotes the confidence interval, or as a function. Defaults to “mean_ci_95”.

  • cmap (str, optional) – The color map. Defaults to “Blues”.

  • na_color (str, optional) – The color of NAs. Defaults to “xkcd:white”, e.g. white.

  • xlab (str, optional) – The name on the x-axis. Defaults to “Agent”.

  • ylab (str, optional) – The name of the y-axis. Defaults to “Opponent”.

  • cbarlabel (str, optional) – The label of the colorbar. Defaults to “Average score of the agent”.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.
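
For example (sketch): after running a tournament, a heatmap of the mean payoff for each agent pairing can be drawn directly from the group. The strategies used below are only illustrative.

>>> from tomsup.agent import AgentGroup
>>> group = AgentGroup(agents=["RB", "WSLS", "QL"], start_params=[{}] * 3)
>>> group.set_env("round_robin")
>>> results = group.compete(p_matrix="penny_competitive", n_rounds=30, n_sim=4)
>>> group.plot_heatmap(cbarlabel="Average score of the agent")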

plot_history(agent0: int, agent1: int, state: str, agent: int = 0, fun: ~typing.Callable = <function AgentGroup.<lambda>>, ylab: str = '', xlab: str = 'Round', show: bool = True)[source]#

Plots the history of an agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • state (str) – The state of the agent you wish to plot.

  • fun (Callable, optional) – A function for extracting the state. Defaults to lambda x: x[state].

  • xlab (str, optional) – The name on the x-axis. Defaults to “Round”.

  • ylab (str, optional) – The name of the y-axis. Defaults to “”.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.
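
A rough sketch of plotting a recorded state: this assumes the tournament was run with save_history=True and that “choice” is among the recorded history keys of the RB agent (as noted under get_history above).

>>> from tomsup.agent import AgentGroup
>>> group = AgentGroup(agents=["RB"] * 2, start_params=[{"bias": 0.7}] * 2)
>>> group.set_env("round_robin")
>>> results = group.compete(p_matrix="penny_competitive", n_rounds=30, n_sim=4, save_history=True)
>>> group.plot_history("RB_0", "RB_1", state="choice", agent=0, ylab="Choice")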

plot_op_states(agent0: str, agent1: str, state: str, level: int = 0, agent: int = 0, show: bool = True)[source]#

plots a state of the simulated opponent within a k-ToM agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • state (str) – a state of the simulated opponent you wish to plot.

  • level (int, optional) – Sophistication level of the simulated opponent you wish to plot. Defaults to 0.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_p_k(agent0: str, agent1: str, level: int, agent: int = 0, show: bool = True)[source]#

plots the p_k of a k-ToM agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • level (int) – The sophistication level (p_k) to plot.

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_p_op_1(agent0: str, agent1: str, agent: int = 0, show: bool = True) None[source]#

plots the p_op_1 of a k-ToM agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_p_self(agent0: str, agent1: str, agent: int = 0, show: bool = True)[source]#

plots the p_self of a k-ToM agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_score(agent0: str, agent1: str, agent: int = 0, show: bool = True)[source]#

plots the score of an agent in a defined agent pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

plot_tom_op_estimate(agent0: int, agent1: int, level: int, estimate: str, agent: int = 0, plot: str = 'mean', transformation: Optional[bool] = None, show: bool = True)[source]#

plots a k-ToM agent’s estimates of its opponent in a given pair

Parameters:
  • agent0 (str) – The name of agent0

  • agent1 (str) – The name of agent1

  • agent (int, optional) – An int denoting which of agent 0 or 1 you should plot. Defaults to 0.

  • estimate (str) – The desired estimate to plot. Options include: “volatility”, “behav_temp” (behavioural temperature), “bias”, “dilution”.

  • level (int) – Sophistication level of the simulated opponent you wish to plot.

  • plot (str, optional) – Toggle between plotting mean (“mean”) or variance (“var”). Default to “mean”.

  • show (bool, optional) – Should plt.show be run at the end. Defaults to True.

set_env(env: str) None[source]#

Set environment of the agent group.

Parameters:

env (str) – The environment you wish to set. Valid environment strings include “round_robin”, which matches all agents against all others, and “random_pairs”, which combines the agents into random pairs (the number of agents must be even).
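
For example (sketch): with an even number of agents the group can also be paired at random; the strategies below are only illustrative.

>>> from tomsup.agent import AgentGroup
>>> group = AgentGroup(agents=["RB", "WSLS", "TFT", "QL"], start_params=[{}] * 4)
>>> group.set_env("random_pairs")  # requires an even number of agents
>>> group.get_environment_name()   # -> 'random_pairs' (assumed return value)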

class tomsup.agent.QL(learning_rate: float = 0.5, b_temp: float = 0.01, expec_val: Tuple[float, float] = (0.5, 0.5), **kwargs)[source]#

Bases: Agent

‘QL’: The Q-learning model by Watkins (1992)

__init__(learning_rate: float = 0.5, b_temp: float = 0.01, expec_val: Tuple[float, float] = (0.5, 0.5), **kwargs)[source]#
Parameters:
  • learning_rate (float, optional) – The degree to which the agent learns. If the learning rate is 0, the agent does not learn. Defaults to 0.5.

  • b_temp (float, optional) – The behavioural temperature of the Q-Learning agent. Defaults to 0.01.

  • expec_val (Tuple[float, float], optional) – The preference for choice 0 and 1. Defaults to (0.5, 0.5).
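
A minimal sketch of letting a Q-learning agent play against a random bias agent using the module-level compete function documented at the bottom of this page; the parameter values are only illustrative.

>>> from tomsup.agent import QL, RB, compete
>>> ql = QL(learning_rate=0.5, b_temp=0.01)
>>> rb = RB(bias=0.7)
>>> results = compete(ql, rb, p_matrix="penny_competitive", n_rounds=50)
>>> ql.get_expected_values()  # the learned preference for choice 0 and 1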

compete(op_choice: ~typing.Optional[int], p_matrix: ~tomsup.payoffmatrix.PayoffMatrix, agent=<class 'int'>, **kwargs) int[source]#
Parameters:
  • op_choice (Optional[int]) – The choice of the opponent, should be None in the first round.

  • p_matrix (PayoffMatrix) – The payoff matrix which the agent plays in

  • agent (int) – The role (either 0 or 1) that the agent has in the payoff matrix.

Returns:

The choice of the agent

Return type:

int

get_expected_values() Tuple[float, float][source]#
Returns:

The preference for choice 0 and 1.

Return type:

Tuple[float, float]

get_learning_rate() float[source]#
Returns:

The learning rate of the agent

Return type:

float

class tomsup.agent.RB(bias: float = 0.5, **kwargs)[source]#

Bases: Agent

‘RB’: Random bias agent

Examples

>>> rb = ts.agent.RB(bias = 1, save_history = True)
>>> rb.compete()
1
>>> rb.get_start_params()
{'bias': 1, 'save_history': True}
>>> rb.compete()
1
>>> rb.get_history(key='choice', format="list")
[1, 1]
__init__(bias: float = 0.5, **kwargs) Agent[source]#
Parameters:

bias (float, optional) – The probability of the agent choosing 1. Defaults to 0.5.

compete(**kwargs) int[source]#
Returns:

The choice of the agent

Return type:

int

get_bias() float[source]#
Returns:

The bias of the agent

Return type:

float

class tomsup.agent.TFT(copy_prob: float = 1, **kwargs)[source]#

Bases: Agent

‘TFT’: Tit-for-Tat is a heuristic theory of mind strategy. An agent using this strategy will initially cooperate and subsequently replicate the opponent’s previous action. If the opponent was previously cooperative, the TFT agent cooperates.

Examples

>>> shelling = ts.agent.TFT()
>>> p_dilemma = ts.PayoffMatrix(name="prisoners_dilemma")
>>> shelling.compete(op_choice=1, p_matrix=p_dilemma)
1
>>> shelling.compete(op_choice=0, p_matrix=p_dilemma)
0
__init__(copy_prob: float = 1, **kwargs) Agent[source]#
Parameters:

copy_prob (float, optional) – The probability that the TFT agent copies the behaviour of the opponent, hereby introducing noise into the original TFT strategy by Shelling (1981). Defaults to 1.

Returns:

The TFT agent

Return type:

Agent

compete(op_choice: ~typing.Optional[int] = None, p_matrix: ~tomsup.payoffmatrix.PayoffMatrix = <tomsup.payoffmatrix.PayoffMatrix object>, verbose: bool = True, **kwargs) int[source]#
Parameters:
  • op_choice (Optional[int]) – The choice of the opponent, should be None in the first round.

  • p_matrix (PayoffMatrix) – The payoff matrix which the agent plays in. Defaults to the prisoner’s dilemma.

  • agent (int) – The role (either 0 or 1) that the agent has in the payoff matrix.

Returns:

The choice of the agent

Return type:

int

class tomsup.agent.TOM(level: int, volatility: float = - 2, b_temp: float = - 1, bias: Optional[float] = 0, dilution: Optional[float] = None, init_states: Union[dict, str] = 'default', **kwargs)[source]#

Bases: Agent

This Theory of Mind agent is the variational implementation of the recursive ToM agent initially proposed by Devaine (2014), but it has been further developed since.

It recursively estimates its opponent and the opponent’s beliefs about itself.

__init__(level: int, volatility: float = - 2, b_temp: float = - 1, bias: Optional[float] = 0, dilution: Optional[float] = None, init_states: Union[dict, str] = 'default', **kwargs) Agent[source]#
Parameters:
  • level (int) – Sophistication level of the agent.

  • volatility (float, optional) – Volatility (σ) indicates how much the agent thinks the opponent might shift their parameters over time. Volatility is a number in the range (0, ∞), but for computational reasons it is input on a log scale, i.e. to obtain a volatility of 0.13 you should input ts.log(0.13) ≈ -2. Defaults to -2, as this was used in the initial implementation of the model.

  • b_temp (float, optional) – The behavioural temperature (also called the exploration temperature) indicates how noisy the k-ToM decision process is. The behavioural temperature is a number in the range (0, ∞), but for computational reasons it is input on a log scale, i.e. to obtain a temperature of 0.37 you should input ts.log(0.37) ≈ -1. Defaults to -1, as this was used in the initial implementation of the model.

  • bias (Optional[float], optional) – The bias indicates the k-ToM agent’s preference for choosing 1. It is added to the expected payoff, i.e. if the expected payoff of choosing 1 is -1 and the bias is +2, the updated ‘expected payoff’ would be +1. Defaults to 0.

  • dilution (Optional[float], optional) – The dilution indicates the degree to which beliefs about the opponent’s sophistication level are forgotten over time. Dilution is a number in the range (0, 1), but for computational reasons it is input on a log odds scale, i.e. to obtain a dilution of 0.62 you should input its log odds, logit(0.62) ≈ 0.5. Defaults to None, as in the initial implementation of the model, meaning the beliefs about the opponent’s sophistication level are not diluted.

  • init_states (Union[dict, str], optional) – The initialization states of the agent. Defaults to “default”. See tutorial on setting initialization states for more info.

Returns:

The k-ToM agent

Return type:

Agent
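
As a sketch, a 2-ToM agent with a specific volatility can be created as follows; note that volatility and temperature are given on the scales described above (ts.log is referenced in the volatility description).

>>> import tomsup as ts
>>> tom2 = ts.agent.TOM(level=2, volatility=ts.log(0.13), b_temp=-1, dilution=None, save_history=True)
>>> tom2.get_level()       # sophistication level, here 2
>>> tom2.get_volatility()  # log(0.13), i.e. roughly -2.04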

compete(p_matrix: PayoffMatrix, agent: int, op_choice: Optional[int] = None) int[source]#
Parameters:
  • op_choice (Optional[int]) – The choice of the opponent, should be None in the first round.

  • p_matrix (PayoffMatrix) – The payoff matrix which the agent plays in

  • agent (int) – The role (either 0 or 1) that the agent has in the payoff matrix.

Returns:

The choice of the agent

Return type:

int
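
A sketch of a manual game loop between a 1-ToM and a 2-ToM agent: op_choice is None in the first round, and in later rounds each agent receives the opponent's previous choice.

>>> import tomsup as ts
>>> penny = ts.PayoffMatrix(name="penny_competitive")
>>> tom1 = ts.agent.TOM(level=1)
>>> tom2 = ts.agent.TOM(level=2)
>>> prev_0, prev_1 = None, None
>>> for _ in range(10):
...     choice_0 = tom1.compete(p_matrix=penny, agent=0, op_choice=prev_1)
...     choice_1 = tom2.compete(p_matrix=penny, agent=1, op_choice=prev_0)
...     prev_0, prev_1 = choice_0, choice_1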

get_behav_temperature() float[source]#
Returns:

The behavioural temperature of the agent

Return type:

float

get_bias() Optional[float][source]#
Returns:

The bias of the agent

Return type:

Optional[float]

get_dilution() Optional[float][source]#
Returns:

The dilution of the agent

Return type:

Optional[float]

get_internal_states() dict[source]#
Returns:

The current internal states of the agent

Return type:

dict

get_level() float[source]#
Returns:

The sophistication level of the agent

Return type:

float

get_parameters() dict[source]#
Returns:

The agents parameters

Return type:

dict

get_volatility() float[source]#
Returns:

The volatility of the agent

Return type:

float

print_internal(keys: Optional[list] = None, level: Optional[list] = None)[source]#

prints the internal states of the agent.

Explanation of the internal states:

  • opponent_states: indicates that the following states belong to the simulated opponent.

  • own_states: indicates that the states belong to the agent itself.

  • p_k: the estimated sophistication level of the opponent.

  • p_op_mean: the mean estimate of the opponent’s choice probability in log odds.

  • param_mean: the mean estimate of the opponent’s parameters (in the scale used for the given parameter). If estimating another k-ToM, the order of the estimates is 1) volatility, 2) behavioural temperature, 3) dilution, 4) bias. Note that bias is 3) if dilution is not estimated.

  • param_var: the variance of the estimates in log scale (same order as in param_mean).

  • gradient: the local-linear gradient for each estimate (same order as in param_mean).

  • p_self: the probability of the agent itself choosing 1.

  • p_op: the aggregate probability of the opponent choosing 1.

Parameters:
  • keys (Optional[list], optional) – The keys you wish to print. Defaults to None, indicating all keys.

  • level (Optional[list], optional) – List of integers containing the levels to print. None indicates that all levels will be printed. Defaults to None.
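
For example (sketch): printing only the estimated sophistication level and the aggregate choice probabilities after a single round; the key names follow the explanation above.

>>> import tomsup as ts
>>> penny = ts.PayoffMatrix(name="penny_competitive")
>>> tom2 = ts.agent.TOM(level=2)
>>> choice = tom2.compete(p_matrix=penny, agent=0, op_choice=None)  # play one round so the states are updated
>>> tom2.print_internal(keys=["p_k", "p_op"])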

print_parameters(keys: Optional[list] = None)[source]#
Parameters:

keys (Optional[list], optional) – The key which you wish to print. Defaults to None, indicating all.

set_internal_states(internal_states: dict) None[source]#
Parameters:

internal_states (dict) – The desired internal states of the agent.
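
A sketch of adjusting the agent's starting belief about the opponent's sophistication level; the “own_states”/“p_k” layout follows the plot_internal examples above, while the specific values are only illustrative and assumed to be probabilities.

>>> import tomsup as ts
>>> tom2 = ts.agent.TOM(level=2)
>>> states = tom2.get_internal_states()
>>> states["own_states"]["p_k"] = [0.3, 0.7]  # put more weight on the opponent being a 1-ToM (assumed scale)
>>> tom2.set_internal_states(states)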

class tomsup.agent.WSLS(prob_stay: float = 1, prob_switch: float = 1, **kwargs)[source]#

Bases: Agent

‘WSLS’: Win-stay, lose-switch is an agent which employs a simple heuristic: it repeats its choice if it wins and switches if it loses.

Examples

>>> sigmund = WSLS()
>>> sigmund.choice = 0  # Manually setting choice
>>> penny = PayoffMatrix(name="penny_competitive")
>>> sigmund.compete(op_choice=1, p_matrix=penny)
0
>>> sigmund.choice = 1  # Manually setting choice
>>> sigmund.compete(op_choice=0, p_matrix=penny)
0
__init__(prob_stay: float = 1, prob_switch: float = 1, **kwargs) Agent[source]#
Parameters:
  • prob_stay (float, optional) – The probability of staying if the agent wins. Defaults to 1.

  • prob_switch (float, optional) – The probability of switching if the agent loses. Defaults to 1.

Returns:

The WSLS agent

Return type:

Agent

compete(op_choice: Optional[int], p_matrix: PayoffMatrix, agent: int, **kwargs) int[source]#
Parameters:
  • op_choice (Optional[int]) – The choice of the opponent, should be None in the first round.

  • p_matrix (PayoffMatrix) – The payoff matrix which the agent plays in

  • agent (int) – The role (either 0 or 1) that the agent has in the payoff matrix.

Returns:

The choice of the agent

Return type:

int

tomsup.agent.compete(agent_0: Agent, agent_1: Agent, p_matrix: PayoffMatrix, n_rounds: int = 1, n_sim: Optional[int] = None, reset_agent: bool = True, return_val: str = 'df', save_history: bool = False, verbose: bool = False, n_jobs: Optional[int] = None)[source]#
Parameters:
  • agent_0 (Agent) – The first agent, an object of class Agent, which should compete.

  • agent_1 (Agent) – The second agent, an object of class Agent, which should compete.

  • p_matrix (PayoffMatrix) – The payoffmatrix in which the agents compete

  • n_rounds (int, optional) – Number of rounds the agents should play in each simulation. Defaults to 1.

  • n_sim (Optional[int], optional) – The number of simulations. Defaults to None.

  • reset_agent (bool, optional) – Should the agents be reset? Defaults to True.

  • save_history (bool, optional) – Should the history of the agents be saved. Defaults to False, as this is memory intensive.

  • return_val (str, optional) – Should values be returned as a pandas dataframe (“df”) or as a “list”. Defaults to “df”.

  • verbose (bool, optional) – Toggles the verbosity of the function. Defaults to False.

  • n_jobs (Optional[int], optional) – Number of parallel jobs. Defaults to None, indicating no parallelization. -1 indicates as many jobs as there are cores on your machine, i.e. os.cpu_count().

Examples

>>> sirRB = RB(bias = 0.7)
>>> sirWSLS = WSLS()
>>> result = compete(sirRB, sirWSLS, p_matrix = "penny_competitive", n_rounds = 10)
>>> type(result)
pandas.core.frame.DataFrame
>>> result.columns
Index(['round', 'action_agent0', 'action_agent1', 'payoff_agent0',
'payoff_agent1'],
dtype='object')
>>> result = compete(sirRB, sirWSLS, p_matrix = "penny_competitive", n_rounds = 10, n_sim = 3, return_val = 'list')
>>> len(result) == 3*10
True
>>> result = compete(sirRB, sirWSLS, p_matrix = "penny_competitive", n_rounds = 100, n_sim = 3, return_val = 'df', verbose = True)
    Running simulation 1 out of 3
    Running simulation 2 out of 3
    Running simulation 3 out of 3
>>> result['payoff_agent1'].mean() > 0  # We see that the WSLS agent on average wins more than it loses against the biased agent (RB)
True