Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
sagemath
GitHub Repository: sagemath/sagelib
Path: blob/master/sage/graphs/base/static_sparse_graph.pyx
4045 views
1
r"""
2
Static Sparse Graphs
3
4
What is the point ?
5
-------------------
6
7
This class implements a Cython (di)graph structure made for efficiency. The
8
graphs are *static*, i.e. no add/remove vertex/edges methods are available, nor
9
can they easily or efficiently be implemented within this data structure.
10
11
The data structure, however, is made to save the maximum amount of computations
12
for graph algorithms whose main operation is to *list the out-neighbours of a
13
vertex* (which is precisely what BFS, DFS, distance computations and the
14
flow-related stuff waste their life on).
15
16
The code contained in this module is written C-style. While Sage needs a class
17
for static graphs (not available today, i.e. 2012-01-13) it is not what we try
18
to address here. The purpose is efficiency and simplicity.
19
20
Author:
21
22
- Nathann Cohen (2011)
23
24
Data structure
25
--------------
26
27
.. MATH::
28
29
\begin{picture}(600,400)(0,0)
30
\multiput(0, 0)(15, 0){31}{\line(0, 1){15}}
31
\multiput(0, 0)(0, 15){2}{\line(1, 0){450}}
32
\put(-70,2){\makebox(0,0)[b]{edges}}
33
\put(-20,-50){\makebox(0,0)[b]{neighbors[0]}}
34
\put(100,-50){\makebox(0,0)[b]{neighbors[1]}}
35
\put(250,-50){\makebox(0,0)[b]{neighbors[i]}}
36
\put(400,-50){\makebox(0,0)[b]{neighbors[n-1]}}
37
\put(500,-30){\makebox(0,0)[b]{neighbors[n]}}
38
\multiput(450, 0)(15, 0){2}{\qbezier[8](0,0)(0,7.5)(0,15)}
39
\multiput(450, 0)(0, 15){2}{\qbezier[8](0,0)(7.5,0)(15,0)}
40
\put(0,0){\makebox(15,15){2}}
41
\put(15,0){\makebox(15,15){3}}
42
\put(30,0){\makebox(15,15){5}}
43
\put(45,0){\makebox(15,15){7}}
44
\put(60,0){\makebox(15,15){8}}
45
\put(75,0){\makebox(15,15){9}}
46
\put(90,0){\makebox(15,15){4}}
47
\put(105,0){\makebox(15,15){8}}
48
\multiput(120, 0)(15, 0){18}{\makebox(15,15){$\cdot$}}
49
\put(390,0){\makebox(15,15){2}}
50
\put(405,0){\makebox(15,15){5}}
51
\put(420,0){\makebox(15,15){8}}
52
\put(435,0){\makebox(15,15){9}}
53
\thicklines
54
\put(-50, 7.5){\vector(1, 0){40}}
55
\put(-18, -28){\vector(2, 3){16}}
56
\put(97, -25){\vector(0, 1){20}}
57
\put(247, -25){\vector(0, 1){20}}
58
\put(397, -25){\vector(0, 1){20}}
59
\put(490, -5){\vector(-2, 1){20}}
60
\end{picture}
61
62
63
The data structure is actually pretty simple and compact. ``short_digraph`` has
64
three fields
65
66
* ``n`` (``int``) -- the number of vertices in the graph.
67
68
* ``edges`` (``unsigned short *``) -- array whose length is the number of
69
edges of the graph.
70
71
* ``neighbors`` (``unsigned short **``) -- this array has size `n+1`, and
72
describes how the data of ``edges`` should be read : the neighbors of
73
vertex `i` are the elements of ``edges`` addressed by
74
``neighbors[i]...neighbors[i+1]-1``. The element ``neighbors[n]``, which
75
corresponds to no vertex (they are numbered from `0` to `n-1`) is present
76
so that it remains easy to enumerate the neighbors of vertex `n-1` : the
77
last of them is the element addressed by ``neighbors[n]-1``.
78
79
80
In the example given above, vertex 0 has 2,3,5,7,8 and 9 as out-neighbors, but
81
not 4, which is an out-neighbour of vertex 1. Vertex `n-1` has 2, 5, 8 and 9 as
82
out-neighbors. `\text{neighbors[n]}` points toward the cell immediately *after*
83
the end of `\text{edges}`, hence *outside of the allocated memory*. It is used
84
to indicate the end of the outneighbors of vertex `n-1`
85
86
**Iterating over the edges**
87
88
This is *the one thing* to have in mind when working with this data structure::
89
90
cdef list_edges(short_digraph g):
91
cdef ushort * v
92
cdef ushort * end
93
cdef int i
94
95
for i in range(g.n):
96
v = g.neighbors[i]
97
end = g.neighbors[i+1]
98
99
while v < end:
100
print "There is an edge from "+str(i)+" to "+str(v[0])
101
v += 1
102
103
**Advantages**
104
105
Three great points :
106
107
* The neighbors of a vertex are contiguous in memory.
108
* Going from one to the other amounts to increasing a pointer.
109
* Storing such graphs is incredibly cheaper than storing Python structures.
110
111
Well, I think it would be hard to have anything more efficient than that to
112
enumerate out-neighbors in sparse graphs ! :-)
113
114
Technical details
115
-----------------
116
117
* When creating a ``fast_digraph`` from a ``Graph`` or ``DiGraph`` named
118
``G``, the `i^{\text{th}}` vertex corresponds to ``G.vertices()[i]``
119
120
* The data structure does not support edge labels
121
122
* In its current implementation (with ``unsigned short`` variables), the
123
data structure can handle graphs with at most 65535 vertices. If
124
necessary, changing it to ``int`` is totally straightforward.
125
126
* Some methods return ``bitset_t`` objets when lists could be
127
expected. There is a very useful ``bitset_list`` function for this kind of
128
problems :-)
129
130
* The codes of this module are well documented, and many answers can be
131
found directly in the code.
132
133
Todo list
134
135
* Adjacency test. The data structure can support it efficiently through
136
dichotomy. It would require to sort the list of edges as it is not done at
137
the moment. Some calls to the C function ``qsort`` would be sufficient.
138
139
Cython functions
140
----------------
141
142
``init_short_digraph(short_digraph g, G)``
143
144
This method initializes ``short_digraph g`` so that it represents ``G``. If
145
``G`` is a ``Graph`` objet (and not a ``DiGraph``), an edge between two
146
vertices `u` and `v` is replaced by two arcs in both directions.
147
148
``int n_edges(short_digraph g)``
149
150
Returns the number of edges in ``g``
151
152
``int out_degree(short_digraph g, int i)``
153
154
Returns the out-degree of vertex `i` in ``g``
155
156
``init_empty_copy(short_digraph dst, short_digraph src)``
157
158
Allocates memory for ``dst`` so that it can contain as many vertices and
159
edges as ``src``. Its content is purely random, though.
160
161
``init_reverse(short_digraph dst, short_digraph src)``
162
163
Initializes ``dst`` so that it represents a copy of ``src`` in which all
164
edges go in the opposite direction.
165
166
``can_be_reached_from(short_digraph g, int src, bitset_t reached)``
167
168
Assuming ``bitset_t reached`` has size at least ``g.n``, this method updates
169
``reached`` so that it represents the set of vertices that can be reached
170
from ``src`` in ``g``.
171
172
``strongly_connected_component_containing_vertex(short_digraph g, short_digraph g_reversed, int v, bitset_t scc)``
173
174
Assuming ``bitset_t reached`` has size at least ``g.n``, this method updates
175
``scc`` so that it represents the vertices of the strongly connected
176
component containing ``v`` in ``g``. The variable ``g_reversed`` is assumed
177
to represent the reverse of ``g``.
178
179
``free_short_digraph(short_digraph g)``
180
181
Free the ressources used by ``g``
182
183
What is this module used for ?
184
------------------------------
185
186
At the moment, it is only used in the :mod:`sage.graphs.distances_all_pairs` module.
187
188
Python functions
189
----------------
190
191
These functions are available so that Python modules from Sage can call the
192
Cython routines this module implements (as they can not directly call methods
193
with C arguments).
194
"""
195
196
include "../../misc/bitset.pxi"
197
198
##############################################################################
199
# Copyright (C) 2010 Nathann Cohen <[email protected]>
200
# Distributed under the terms of the GNU General Public License (GPL)
201
# The full text of the GPL is available at:
202
# http://www.gnu.org/licenses/
203
##############################################################################
204
205
from sage.graphs.base.c_graph cimport CGraph
206
207
cdef int init_short_digraph(short_digraph g, G) except -1:
208
209
# g.n is unsigned short, so -1 is actually the maximum value possible.
210
g.n = -1
211
212
if G.order() > g.n:
213
raise ValueError("This structure can handle at most "+str(<int> g.n)+" vertices !")
214
else:
215
g.n = G.order()
216
217
cdef int isdigraph
218
219
from sage.graphs.all import Graph, DiGraph
220
221
if isinstance(G, DiGraph):
222
isdigraph = 1
223
elif isinstance(G, Graph):
224
isdigraph = 0
225
else:
226
raise ValueError("The source graph must be either a DiGraph or a Graph object !")
227
228
cdef list vertices = G.vertices()
229
cdef dict v_to_id = {}
230
cdef ushort i,j
231
232
cdef int n_edges = G.size() if isdigraph else 2*G.size()
233
234
for i, v in enumerate(vertices):
235
v_to_id[v] = i
236
237
g.edges = <ushort *> sage_malloc(n_edges*sizeof(ushort))
238
if g.edges == NULL:
239
raise ValueError("Problem while allocating memory (edges)")
240
241
g.neighbors = <ushort **> sage_malloc((g.n+1)*sizeof(ushort *))
242
if g.neighbors == NULL:
243
raise ValueError("Problem while allocating memory (neighbors)")
244
245
246
# Initializing the value of neighbors
247
g.neighbors[0] = g.edges
248
249
cdef CGraph cg = <CGraph> G._backend
250
for i in range(1,g.n+1):
251
g.neighbors[i] = g.neighbors[i-1] + <int> (cg.out_degree(vertices[i-1]) if isdigraph else G.degree(vertices[i-1]))
252
253
for u,v in G.edge_iterator(labels = False):
254
i = v_to_id[u]
255
j = v_to_id[v]
256
257
g.neighbors[i][0] = j
258
g.neighbors[i] += 1
259
260
if not isdigraph:
261
g.neighbors[j][0] = i
262
g.neighbors[j] += 1
263
264
# Reinitializing the value of neighbors
265
for g.n> i >0:
266
g.neighbors[i] = g.neighbors[i-1]
267
268
g.neighbors[0] = g.edges
269
270
cdef inline int n_edges(short_digraph g):
271
# The number of edges is nothing but a difference of pointers
272
return <int> (g.neighbors[g.n]-g.edges)
273
274
cdef inline int out_degree(short_digraph g, int i):
275
# The out-degree is nothing but a difference of pointers
276
return <int> (g.neighbors[i+1]-g.neighbors[i])
277
278
cdef int init_empty_copy(short_digraph dst, short_digraph src) except -1:
279
dst.n = src.n
280
281
dst.edges = <ushort *> sage_malloc(n_edges(src)*sizeof(ushort))
282
if dst.edges == NULL:
283
raise ValueError("Problem while allocating memory (edges)")
284
285
dst.neighbors = <ushort **> sage_malloc((src.n+1)*sizeof(ushort *))
286
if dst.neighbors == NULL:
287
raise ValueError("Problem while allocating memory (neighbors)")
288
289
cdef int init_reverse(short_digraph dst, short_digraph src) except -1:
290
# Allocates memory for dst
291
init_empty_copy(dst, src)
292
293
# Avoiding a later segfault
294
if dst.n == 0:
295
return 0
296
297
#### 1/3
298
#
299
# In a first pass, we count the in-degrees of each vertex and store it in a
300
# vector. With this information, we can initialize dst.neighbors to its
301
# correct value. The content of dst.edges is not touched at this level.
302
303
cdef int * in_degree = <int *> sage_malloc(src.n*sizeof(int))
304
if in_degree == NULL:
305
raise ValueError("Problem while allocating memory (in_degree)")
306
307
# Counting the degrees
308
memset(in_degree, 0, src.n*sizeof(int))
309
310
cdef ushort * v = src.edges
311
cdef ushort * end = src.neighbors[src.n]
312
while v < end:
313
in_degree[v[0]] += 1
314
v += 1
315
316
# Updating dst.neighbors
317
cdef int i
318
dst.neighbors[0] = dst.edges
319
for i in range(1, src.n+1):
320
dst.neighbors[i] = dst.neighbors[i-1] + in_degree[i-1]
321
sage_free(in_degree)
322
323
#### 2/3
324
#
325
# Second pass : we list the edges again, and add them in dst.edges. Doing
326
# so, we will change the value of dst.neighbors, but that is not so bad as
327
# we can fix it afterwards.
328
for i in range(0, src.n):
329
v = src.neighbors[i]
330
end = src.neighbors[i+1]
331
332
while v < end:
333
dst.neighbors[v[0]][0] = i
334
dst.neighbors[v[0]] += 1
335
v += 1
336
337
#### 3/3
338
#
339
# Final step : set the correct values of dst.neighbors again. It is easy, as
340
# the correct value of dst.neighbors[i] is actually dst.neighbors[i-1]
341
for src.n> i >0:
342
dst.neighbors[i] = dst.neighbors[i-1]
343
dst.neighbors[0] = dst.edges
344
345
cdef int can_be_reached_from(short_digraph g, int src, bitset_t reached) except -1:
346
if g.n == 0:
347
return 0
348
349
# Initializing the set of vertices reached by setting only bit src
350
bitset_set_first_n(reached, 0)
351
bitset_add(reached, src)
352
353
# We will be doing a Depth-First Search. We allocate the stack we need for
354
# that, and put "src" on top of it.
355
cdef ushort * stack = <ushort *> sage_malloc(g.n*sizeof(ushort))
356
if stack == NULL:
357
raise ValueError("Problem while allocating memory (stack)")
358
359
stack[0] = src
360
cdef int stack_size = 1
361
362
# What we need to iterate over the edges
363
cdef int i
364
cdef ushort * v
365
cdef ushort * end
366
367
# Plain old DFS ...
368
#
369
#If there is something left on the stack, we remove it consider each of its
370
# neighbors. If we find any which has not been reached yet, we set its
371
# corresponding bit in the reached bitset, and add it on top of the stack.
372
373
while stack_size:
374
stack_size -= 1
375
i = stack[stack_size]
376
377
v = g.neighbors[i]
378
end = g.neighbors[i+1]
379
380
while v < end:
381
if not bitset_in(reached, v[0]):
382
bitset_add(reached, v[0])
383
stack[stack_size] = v[0]
384
stack_size += 1
385
386
v += 1
387
388
sage_free(stack)
389
390
cdef strongly_connected_component_containing_vertex(short_digraph g, short_digraph g_reversed, int v, bitset_t scc):
391
392
# Computing the set of vertices that can be reached from v in g
393
can_be_reached_from(g, v, scc)
394
395
# Computing the set of vertices that can be reached from v in g *reversed*
396
cdef bitset_t scc_reversed
397
bitset_init(scc_reversed, g.n)
398
can_be_reached_from(g_reversed, v, scc_reversed)
399
400
# The scc containing v is the intersection of both sets
401
bitset_intersection(scc, scc, scc_reversed)
402
403
cdef void free_short_digraph(short_digraph g):
404
405
if g.edges != NULL:
406
sage_free(g.edges)
407
408
if g.neighbors != NULL:
409
sage_free(g.neighbors)
410
411
def strongly_connected_components(G):
412
r"""
413
Returns the strongly connected components of the given DiGraph.
414
415
INPUT:
416
417
- ``G`` -- a DiGraph.
418
419
.. NOTE::
420
421
This method has been written as an attempt to solve the slowness
422
reported in :trac:`12235`. It is not the one used by
423
:meth:`sage.graphs.digraph.DiGraph.strongly_connected_components` as
424
saving some time on the computation of the strongly connected components
425
is not worth copying the whole graph, but it is a nice way to test this
426
module's functions. It is also tested in the doctest or
427
:meth:`sage.graphs.digraph.DiGraph.strongly_connected_components`.
428
429
EXAMPLE::
430
431
sage: from sage.graphs.base.static_sparse_graph import strongly_connected_components
432
sage: g = digraphs.ButterflyGraph(2)
433
sage: strongly_connected_components(g)
434
[[('00', 0)], [('00', 1)], [('00', 2)], [('01', 0)], [('01', 1)], [('01', 2)],
435
[('10', 0)], [('10', 1)], [('10', 2)], [('11', 0)], [('11', 1)], [('11', 2)]]
436
"""
437
438
if G.order() == 0:
439
return [[]]
440
441
# To compute the connected component containing a given vertex v, we take
442
# the intersection of the set of vertices that can be reached from v in G
443
# and the set of vertices that can be reached from v in G reversed.
444
#
445
# That's all that happens here.
446
447
cdef list answer = []
448
cdef list vertices = G.vertices()
449
cdef short_digraph g, gr
450
451
init_short_digraph(g, G)
452
453
init_reverse(gr, g)
454
455
cdef bitset_t seen
456
bitset_init(seen, g.n)
457
bitset_set_first_n(seen, 0)
458
459
460
cdef bitset_t scc
461
bitset_init(scc, g.n)
462
bitset_set_first_n(scc, 0)
463
464
cdef int v
465
while bitset_len(seen) < g.n:
466
v = bitset_first_in_complement(seen)
467
strongly_connected_component_containing_vertex(g, gr, v, scc)
468
answer.append([vertices[i] for i in bitset_list(scc)])
469
bitset_union(seen, seen, scc)
470
471
bitset_free(seen)
472
bitset_free(scc)
473
free_short_digraph(g)
474
free_short_digraph(gr)
475
return answer
476
477
478