We study the dynamics of ego-centered views of the Internet. Given a monitor and a fixed set of destinations, one such view is obtained by measuring the routes from a monitor to a set of destinations. This can be performed quickly and with low network load with the tracetree tool¹. Thus, we have a sequence of ego-centric views of Internet. Each view can be represented by a graph G(V,E), where V denotes IP addresses of routers and E is a set of connections between them.
I will be happy to have a visual representation of differences between such views, just like a particle trajectory in kinematics. In order to do this we need to go from graphs to points in an euclidean space Rᵐ. One can use classical multidimensional scaling (MDS)² for construct such set of points in euclidean space Rᵐ. Each point will represent one graph. And because we have a time label for each graph, now we can easily trace the "trajectory of ego-centric views".
But there is a small pitfall: we need to find a good distance function for the graphs, because MDS takes a set of these distances and returns a set of points such that the distances between the points are equal to the original distances between graphs. Here we take a highly general measure of structural similarity namely the Hamming distance between two labeled graphs⁴ G_{i} (V_{i}, E_{i}) and G_{j} (V_{j}, E_{j}):
In other words, this distance equals to symmetric difference between sets of links. Now we are able to compare different graphs and different ego-centric views of Internet and trace the "trajectory of ego-centric views". But we can not see how trajectory looks, simply because of number of dimensions m (it is rougly equals to the number of graphs). We should project our curve in some subspace of Rᵐ One can do it with the well known method: PCA (principal component analysis). We apply PCA and obtain one-dimensional projection to the most informative axes. Here the most informative axes means axes which corresponds to maximum of data variation. We will call this axes _{1}R.
Again, because we know when each graph was obtained we can trace the positions of projection on _{1}R as function of time.
The video shows the evolution of these projections.
Ego-centric views of the Internet exhibit strong and systematic dynamics: more time passed between measurements, more the views are different. In addition, there is a some noise. This noise may be explained by load-balancing or by other non-systematic changes in network topology.
Where the Internet is going?
With what speed? Whether the speed changes over time?
What is the meaning of that?
Moreover, one can use another distance between graphs, e.g. the structural distance³, for capture the real structural changes and not just the changes of addresses of routers.