Abstract:
Growing interest in modelling complex systems from brains to societies to cities using networks has led to increased efforts to describe generative processes that explain those networks. Recent successes in machine learning have prompted the usage of evolutionary computation, especially genetic programming to evolve computer programs that effectively forage a multidimensional search space to iteratively find better solutions that explain network structure. Symbolic regression contributes to these approaches by replicating network morphologies using both structure and processes, all while not relying on the scientist’s intuition or expertise. It distinguishes itself by introducing a novel formulation of a network generator and a parameter-free fitness function to evaluate the generated network and is found to consistently retrieve synthetically generated growth processes as well as simple, interpretable rules for a range of empirical networks. We extend this approach by modifying generator semantics to create and retrieve rules for time-varying networks. Lexicon to study networks created dynamically in multiple stages is introduced. The framework was improved using methods from the genetic programming toolkit (recombination) and computational improvements (using heuristic distance measures) and used to test the consistency and robustness of the upgrades to the semantics using synthetically generated networks. Using recombination was found to improve retrieval rate and fitness of the solutions. The framework was then used on three empirical datasets - subway networks of major cities, regions of street networks and semantic co-occurrence networks of literature in Artificial Intelligence to illustrate the possibility of obtaining interpretable, decentralised growth processes from complex networks.