Path: blob/master/sage/graphs/base/static_sparse_graph.pyx
4045 views
r"""1Static Sparse Graphs23What is the point ?4-------------------56This class implements a Cython (di)graph structure made for efficiency. The7graphs are *static*, i.e. no add/remove vertex/edges methods are available, nor8can they easily or efficiently be implemented within this data structure.910The data structure, however, is made to save the maximum amount of computations11for graph algorithms whose main operation is to *list the out-neighbours of a12vertex* (which is precisely what BFS, DFS, distance computations and the13flow-related stuff waste their life on).1415The code contained in this module is written C-style. While Sage needs a class16for static graphs (not available today, i.e. 2012-01-13) it is not what we try17to address here. The purpose is efficiency and simplicity.1819Author:2021- Nathann Cohen (2011)2223Data structure24--------------2526.. MATH::2728\begin{picture}(600,400)(0,0)29\multiput(0, 0)(15, 0){31}{\line(0, 1){15}}30\multiput(0, 0)(0, 15){2}{\line(1, 0){450}}31\put(-70,2){\makebox(0,0)[b]{edges}}32\put(-20,-50){\makebox(0,0)[b]{neighbors[0]}}33\put(100,-50){\makebox(0,0)[b]{neighbors[1]}}34\put(250,-50){\makebox(0,0)[b]{neighbors[i]}}35\put(400,-50){\makebox(0,0)[b]{neighbors[n-1]}}36\put(500,-30){\makebox(0,0)[b]{neighbors[n]}}37\multiput(450, 0)(15, 0){2}{\qbezier[8](0,0)(0,7.5)(0,15)}38\multiput(450, 0)(0, 15){2}{\qbezier[8](0,0)(7.5,0)(15,0)}39\put(0,0){\makebox(15,15){2}}40\put(15,0){\makebox(15,15){3}}41\put(30,0){\makebox(15,15){5}}42\put(45,0){\makebox(15,15){7}}43\put(60,0){\makebox(15,15){8}}44\put(75,0){\makebox(15,15){9}}45\put(90,0){\makebox(15,15){4}}46\put(105,0){\makebox(15,15){8}}47\multiput(120, 0)(15, 0){18}{\makebox(15,15){$\cdot$}}48\put(390,0){\makebox(15,15){2}}49\put(405,0){\makebox(15,15){5}}50\put(420,0){\makebox(15,15){8}}51\put(435,0){\makebox(15,15){9}}52\thicklines53\put(-50, 7.5){\vector(1, 0){40}}54\put(-18, -28){\vector(2, 3){16}}55\put(97, -25){\vector(0, 1){20}}56\put(247, -25){\vector(0, 1){20}}57\put(397, -25){\vector(0, 1){20}}58\put(490, -5){\vector(-2, 1){20}}59\end{picture}606162The data structure is actually pretty simple and compact. ``short_digraph`` has63three fields6465* ``n`` (``int``) -- the number of vertices in the graph.6667* ``edges`` (``unsigned short *``) -- array whose length is the number of68edges of the graph.6970* ``neighbors`` (``unsigned short **``) -- this array has size `n+1`, and71describes how the data of ``edges`` should be read : the neighbors of72vertex `i` are the elements of ``edges`` addressed by73``neighbors[i]...neighbors[i+1]-1``. The element ``neighbors[n]``, which74corresponds to no vertex (they are numbered from `0` to `n-1`) is present75so that it remains easy to enumerate the neighbors of vertex `n-1` : the76last of them is the element addressed by ``neighbors[n]-1``.777879In the example given above, vertex 0 has 2,3,5,7,8 and 9 as out-neighbors, but80not 4, which is an out-neighbour of vertex 1. Vertex `n-1` has 2, 5, 8 and 9 as81out-neighbors. `\text{neighbors[n]}` points toward the cell immediately *after*82the end of `\text{edges}`, hence *outside of the allocated memory*. It is used83to indicate the end of the outneighbors of vertex `n-1`8485**Iterating over the edges**8687This is *the one thing* to have in mind when working with this data structure::8889cdef list_edges(short_digraph g):90cdef ushort * v91cdef ushort * end92cdef int i9394for i in range(g.n):95v = g.neighbors[i]96end = g.neighbors[i+1]9798while v < end:99print "There is an edge from "+str(i)+" to "+str(v[0])100v += 1101102**Advantages**103104Three great points :105106* The neighbors of a vertex are contiguous in memory.107* Going from one to the other amounts to increasing a pointer.108* Storing such graphs is incredibly cheaper than storing Python structures.109110Well, I think it would be hard to have anything more efficient than that to111enumerate out-neighbors in sparse graphs ! :-)112113Technical details114-----------------115116* When creating a ``fast_digraph`` from a ``Graph`` or ``DiGraph`` named117``G``, the `i^{\text{th}}` vertex corresponds to ``G.vertices()[i]``118119* The data structure does not support edge labels120121* In its current implementation (with ``unsigned short`` variables), the122data structure can handle graphs with at most 65535 vertices. If123necessary, changing it to ``int`` is totally straightforward.124125* Some methods return ``bitset_t`` objets when lists could be126expected. There is a very useful ``bitset_list`` function for this kind of127problems :-)128129* The codes of this module are well documented, and many answers can be130found directly in the code.131132Todo list133134* Adjacency test. The data structure can support it efficiently through135dichotomy. It would require to sort the list of edges as it is not done at136the moment. Some calls to the C function ``qsort`` would be sufficient.137138Cython functions139----------------140141``init_short_digraph(short_digraph g, G)``142143This method initializes ``short_digraph g`` so that it represents ``G``. If144``G`` is a ``Graph`` objet (and not a ``DiGraph``), an edge between two145vertices `u` and `v` is replaced by two arcs in both directions.146147``int n_edges(short_digraph g)``148149Returns the number of edges in ``g``150151``int out_degree(short_digraph g, int i)``152153Returns the out-degree of vertex `i` in ``g``154155``init_empty_copy(short_digraph dst, short_digraph src)``156157Allocates memory for ``dst`` so that it can contain as many vertices and158edges as ``src``. Its content is purely random, though.159160``init_reverse(short_digraph dst, short_digraph src)``161162Initializes ``dst`` so that it represents a copy of ``src`` in which all163edges go in the opposite direction.164165``can_be_reached_from(short_digraph g, int src, bitset_t reached)``166167Assuming ``bitset_t reached`` has size at least ``g.n``, this method updates168``reached`` so that it represents the set of vertices that can be reached169from ``src`` in ``g``.170171``strongly_connected_component_containing_vertex(short_digraph g, short_digraph g_reversed, int v, bitset_t scc)``172173Assuming ``bitset_t reached`` has size at least ``g.n``, this method updates174``scc`` so that it represents the vertices of the strongly connected175component containing ``v`` in ``g``. The variable ``g_reversed`` is assumed176to represent the reverse of ``g``.177178``free_short_digraph(short_digraph g)``179180Free the ressources used by ``g``181182What is this module used for ?183------------------------------184185At the moment, it is only used in the :mod:`sage.graphs.distances_all_pairs` module.186187Python functions188----------------189190These functions are available so that Python modules from Sage can call the191Cython routines this module implements (as they can not directly call methods192with C arguments).193"""194195include "../../misc/bitset.pxi"196197##############################################################################198# Copyright (C) 2010 Nathann Cohen <[email protected]>199# Distributed under the terms of the GNU General Public License (GPL)200# The full text of the GPL is available at:201# http://www.gnu.org/licenses/202##############################################################################203204from sage.graphs.base.c_graph cimport CGraph205206cdef int init_short_digraph(short_digraph g, G) except -1:207208# g.n is unsigned short, so -1 is actually the maximum value possible.209g.n = -1210211if G.order() > g.n:212raise ValueError("This structure can handle at most "+str(<int> g.n)+" vertices !")213else:214g.n = G.order()215216cdef int isdigraph217218from sage.graphs.all import Graph, DiGraph219220if isinstance(G, DiGraph):221isdigraph = 1222elif isinstance(G, Graph):223isdigraph = 0224else:225raise ValueError("The source graph must be either a DiGraph or a Graph object !")226227cdef list vertices = G.vertices()228cdef dict v_to_id = {}229cdef ushort i,j230231cdef int n_edges = G.size() if isdigraph else 2*G.size()232233for i, v in enumerate(vertices):234v_to_id[v] = i235236g.edges = <ushort *> sage_malloc(n_edges*sizeof(ushort))237if g.edges == NULL:238raise ValueError("Problem while allocating memory (edges)")239240g.neighbors = <ushort **> sage_malloc((g.n+1)*sizeof(ushort *))241if g.neighbors == NULL:242raise ValueError("Problem while allocating memory (neighbors)")243244245# Initializing the value of neighbors246g.neighbors[0] = g.edges247248cdef CGraph cg = <CGraph> G._backend249for i in range(1,g.n+1):250g.neighbors[i] = g.neighbors[i-1] + <int> (cg.out_degree(vertices[i-1]) if isdigraph else G.degree(vertices[i-1]))251252for u,v in G.edge_iterator(labels = False):253i = v_to_id[u]254j = v_to_id[v]255256g.neighbors[i][0] = j257g.neighbors[i] += 1258259if not isdigraph:260g.neighbors[j][0] = i261g.neighbors[j] += 1262263# Reinitializing the value of neighbors264for g.n> i >0:265g.neighbors[i] = g.neighbors[i-1]266267g.neighbors[0] = g.edges268269cdef inline int n_edges(short_digraph g):270# The number of edges is nothing but a difference of pointers271return <int> (g.neighbors[g.n]-g.edges)272273cdef inline int out_degree(short_digraph g, int i):274# The out-degree is nothing but a difference of pointers275return <int> (g.neighbors[i+1]-g.neighbors[i])276277cdef int init_empty_copy(short_digraph dst, short_digraph src) except -1:278dst.n = src.n279280dst.edges = <ushort *> sage_malloc(n_edges(src)*sizeof(ushort))281if dst.edges == NULL:282raise ValueError("Problem while allocating memory (edges)")283284dst.neighbors = <ushort **> sage_malloc((src.n+1)*sizeof(ushort *))285if dst.neighbors == NULL:286raise ValueError("Problem while allocating memory (neighbors)")287288cdef int init_reverse(short_digraph dst, short_digraph src) except -1:289# Allocates memory for dst290init_empty_copy(dst, src)291292# Avoiding a later segfault293if dst.n == 0:294return 0295296#### 1/3297#298# In a first pass, we count the in-degrees of each vertex and store it in a299# vector. With this information, we can initialize dst.neighbors to its300# correct value. The content of dst.edges is not touched at this level.301302cdef int * in_degree = <int *> sage_malloc(src.n*sizeof(int))303if in_degree == NULL:304raise ValueError("Problem while allocating memory (in_degree)")305306# Counting the degrees307memset(in_degree, 0, src.n*sizeof(int))308309cdef ushort * v = src.edges310cdef ushort * end = src.neighbors[src.n]311while v < end:312in_degree[v[0]] += 1313v += 1314315# Updating dst.neighbors316cdef int i317dst.neighbors[0] = dst.edges318for i in range(1, src.n+1):319dst.neighbors[i] = dst.neighbors[i-1] + in_degree[i-1]320sage_free(in_degree)321322#### 2/3323#324# Second pass : we list the edges again, and add them in dst.edges. Doing325# so, we will change the value of dst.neighbors, but that is not so bad as326# we can fix it afterwards.327for i in range(0, src.n):328v = src.neighbors[i]329end = src.neighbors[i+1]330331while v < end:332dst.neighbors[v[0]][0] = i333dst.neighbors[v[0]] += 1334v += 1335336#### 3/3337#338# Final step : set the correct values of dst.neighbors again. It is easy, as339# the correct value of dst.neighbors[i] is actually dst.neighbors[i-1]340for src.n> i >0:341dst.neighbors[i] = dst.neighbors[i-1]342dst.neighbors[0] = dst.edges343344cdef int can_be_reached_from(short_digraph g, int src, bitset_t reached) except -1:345if g.n == 0:346return 0347348# Initializing the set of vertices reached by setting only bit src349bitset_set_first_n(reached, 0)350bitset_add(reached, src)351352# We will be doing a Depth-First Search. We allocate the stack we need for353# that, and put "src" on top of it.354cdef ushort * stack = <ushort *> sage_malloc(g.n*sizeof(ushort))355if stack == NULL:356raise ValueError("Problem while allocating memory (stack)")357358stack[0] = src359cdef int stack_size = 1360361# What we need to iterate over the edges362cdef int i363cdef ushort * v364cdef ushort * end365366# Plain old DFS ...367#368#If there is something left on the stack, we remove it consider each of its369# neighbors. If we find any which has not been reached yet, we set its370# corresponding bit in the reached bitset, and add it on top of the stack.371372while stack_size:373stack_size -= 1374i = stack[stack_size]375376v = g.neighbors[i]377end = g.neighbors[i+1]378379while v < end:380if not bitset_in(reached, v[0]):381bitset_add(reached, v[0])382stack[stack_size] = v[0]383stack_size += 1384385v += 1386387sage_free(stack)388389cdef strongly_connected_component_containing_vertex(short_digraph g, short_digraph g_reversed, int v, bitset_t scc):390391# Computing the set of vertices that can be reached from v in g392can_be_reached_from(g, v, scc)393394# Computing the set of vertices that can be reached from v in g *reversed*395cdef bitset_t scc_reversed396bitset_init(scc_reversed, g.n)397can_be_reached_from(g_reversed, v, scc_reversed)398399# The scc containing v is the intersection of both sets400bitset_intersection(scc, scc, scc_reversed)401402cdef void free_short_digraph(short_digraph g):403404if g.edges != NULL:405sage_free(g.edges)406407if g.neighbors != NULL:408sage_free(g.neighbors)409410def strongly_connected_components(G):411r"""412Returns the strongly connected components of the given DiGraph.413414INPUT:415416- ``G`` -- a DiGraph.417418.. NOTE::419420This method has been written as an attempt to solve the slowness421reported in :trac:`12235`. It is not the one used by422:meth:`sage.graphs.digraph.DiGraph.strongly_connected_components` as423saving some time on the computation of the strongly connected components424is not worth copying the whole graph, but it is a nice way to test this425module's functions. It is also tested in the doctest or426:meth:`sage.graphs.digraph.DiGraph.strongly_connected_components`.427428EXAMPLE::429430sage: from sage.graphs.base.static_sparse_graph import strongly_connected_components431sage: g = digraphs.ButterflyGraph(2)432sage: strongly_connected_components(g)433[[('00', 0)], [('00', 1)], [('00', 2)], [('01', 0)], [('01', 1)], [('01', 2)],434[('10', 0)], [('10', 1)], [('10', 2)], [('11', 0)], [('11', 1)], [('11', 2)]]435"""436437if G.order() == 0:438return [[]]439440# To compute the connected component containing a given vertex v, we take441# the intersection of the set of vertices that can be reached from v in G442# and the set of vertices that can be reached from v in G reversed.443#444# That's all that happens here.445446cdef list answer = []447cdef list vertices = G.vertices()448cdef short_digraph g, gr449450init_short_digraph(g, G)451452init_reverse(gr, g)453454cdef bitset_t seen455bitset_init(seen, g.n)456bitset_set_first_n(seen, 0)457458459cdef bitset_t scc460bitset_init(scc, g.n)461bitset_set_first_n(scc, 0)462463cdef int v464while bitset_len(seen) < g.n:465v = bitset_first_in_complement(seen)466strongly_connected_component_containing_vertex(g, gr, v, scc)467answer.append([vertices[i] for i in bitset_list(scc)])468bitset_union(seen, seen, scc)469470bitset_free(seen)471bitset_free(scc)472free_short_digraph(g)473free_short_digraph(gr)474return answer475476477478