PhD Student in Computer Science
Department of Computer Science
University of Hertfordshire
Hatfield AL10 9AB
Telephone: +44 (0)1707 286500
I joined the Adaptive Systems Research Group at the University of Hertfordshire as a PhD student in October 2007. My principal supervisor is Dr. Daniel Polani, and my secondary supervisor is Prof. Chrystopher Nehaniv.
My research is centered around the perception-action loop formalism and the use of information-theoretic tools to drive intelligent behaviour. I am particularly interested in what type of structure must exist in a world to encourage adaptivity and learning in an embodied agent, and how that structure may be detected and exploited by such an agent.
My PhD is part-time, so please be patient if you email me and I don't get back to you as quickly as I might. Don't let this put you off contacting me though - I love speaking with fellow researchers.
Anthony, T., Polani, D., and Nehaniv, C. L. (2014). General Self-Motivation and Strategy Identification: Case Studies based on Sokoban and Pac-Man. IEEE Transactions on Computational Intelligence and AI in Games, 6(1), 1-17. (PDF | Abstract | Bibtex)
In this paper, we use empowerment, a recently introduced biologically inspired measure, to allow an AI player to assign utility values to potential future states within a previously unencountered game without requiring explicit specification of goal states. We further introduce strategic affinity, a method of grouping action sequences together to form “strategies,” by examining the overlap in the sets of potential future states following each such action sequence. We also demonstrate an information-theoretic method of predicting future utility. Combining these methods, we extend empowerment to soft-horizon empowerment which enables the player to select a repertoire of action sequences that aim to maintain anticipated utility. We show how this method provides a proto-heuristic for nonterminal states prior to specifying concrete game goals, and propose it as a principled candidate model for “intuitive” strategy selection, in line with other recent work on “self-motivated agent behavior.” We demonstrate that the technique, despite being generically defined independently of scenario, performs quite well in relatively disparate scenarios, such as a Sokoban-inspired box-pushing scenario and in a Pac-Man-inspired predator game, suggesting novel and principle-based candidate routes toward more general game-playing algorithms.
Anthony, T., Polani, D., and Nehaniv, C. L. (2009). Impoverished Empowerment: 'Meaningful' Action Sequence Generation through Bandwidth Limitation. In Proc. European Conference on Artificial Life 2009, Budapest. Springer. (PDF | Abstract | Bibtex)
Empowerment is a promising concept to begin explaining how some biological organisms may assign a priori value expectations to states in taskless scenarios. Standard empowerment samples the full richness of an environment and assumes it can be fully explored. This may be too aggressive an assumption; here we explore impoverished versions achieved by limiting the bandwidth of the empowerment-generating action sequences. It turns out that limited richness of actions concentrates on the “most important” ones with the additional benefit that the empowerment horizon can be extended drastically into the future. This indicates a path towards an intrinsic preselection for preferred behaviour sequences and helps to suggest more biologically plausible approaches.
Anthony, T., Polani, D., and Nehaniv, C. L. (2008). On Preferred States of Agents: how Global Structure is reflected in Local Structure. In Bullock, S., Noble, J., Watson, R., and Bedau, M. A., editors, Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems, pages 25-32. MIT Press, Cambridge, MA. (PDF | Abstract | Bibtex)
We investigate the correlation between the information-theoretic measure of empowerment and the graph-theoretic measure of closeness centrality, to better understand the structural conditions that must exist in a world for learning and adaptation. We examine the two measures both in a simple gridworld scenario, represented as a graph, and on a scale-free graph. We show a strong correlation between the two measures, and discuss the strengths and weaknesses of both. We go on to show how the local measurement of empowerment can in many cases predict the global measurement of closeness centrality.
My research runs in parallel with the work of a few others in my research group:
- Philippe Capdepuy - Concerned with the perception-action loop and its information-theoretic properties, along with sensor evolution, with a particular interest in collective behaviour.
- Christoph Salge - Modelling and simulating aspects of social interaction with the help of information theory.
- Sander van Dijk - Using information theory for intelligent agent control.
When not hard at work on my research, I work in search engine optimisation. I enjoy the same hobbies as most people (films, reading, friends and family) as well as some less common ones. I study Jeet Kune Do, a martial art and philosophy established by Bruce Lee, and I enjoy photography. I also like playing both poker and chess, and am developing a bit of a hobby around web application security, for which I've already claimed a couple of Google Security Bounties. I occasionally write on my blog.