8 Parallel computing with SCSCP

8.1 Managing multiple requests

Using the procedure calls explained in the previous section, the user can
create several requests to multiple services, either to execute them in
parallel or to wait until the fastest result becomes available.
8.1-1 SynchronizeProcesses

SynchronizeProcesses( process1, process2, ..., processN )  function
SynchronizeProcesses( proclist )  function
Returns: list of records with components object and attributes

The function collects the results from each process given in the argument
and returns a list whose i-th entry is the result obtained from the i-th
process. The function accepts either a single argument that is a list of
processes, or an arbitrary number of arguments, each of which is a process.
 Example 
gap> a:=NewProcess( "WS_Factorial", [10], "localhost", 26133 );
< process at localhost:26133 pid=2064 >
gap> b:=NewProcess( "WS_Factorial", [20], "localhost", 26134 );
< process at localhost:26134 pid=1975 >
gap> SynchronizeProcesses(a,b);
[ rec( attributes := [ [ "call_id", "localhost:26133:2064:yCWBGYFO" ] ],
  object := 3628800 ),
  rec( attributes := [ [ "call_id", "localhost:26134:1975:yAAWvGTL" ] ],
  object := 2432902008176640000 ) ]

8.1-2 FirstProcess

FirstProcess( process1, process2, ..., processN )  function
FirstProcess( proclist )  function
Returns: record with components object and attributes

The function waits for the results from the processes given in the argument
and returns the result that arrives first, terminating all remaining
processes at the same time. The function accepts either a single argument
that is a list of processes, or an arbitrary number of arguments, each of
which is a process.
 Example 
gap> a:=NewProcess( "WS_Factorial", [10], "localhost", 26133 );
< process at localhost:26133 pid=2064 >
gap> b:=NewProcess( "WS_Factorial", [20], "localhost", 26134 );
< process at localhost:26134 pid=1975 >
gap> FirstProcess(a,b);
rec( attributes := [ [ "call_id", "localhost:26133:2064:mdb8RaO2" ] ],
  object := 3628800 )

8.1-3 SCSCPservers

SCSCPservers  global variable

SCSCPservers is a list of hosts and ports to search for SCSCP services
(which may be provided not only by GAP services but also by other
SCSCP-compliant systems).

It is used by the parallel skeletons ParQuickWithSCSCP (8.1-4) and
ParListWithSCSCP (8.2-1).

The initial value of this variable is specified in the file
scscp/configpar.g and may be reassigned later.

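For example (a sketch only: the hostnames and port numbers are illustrative,
assuming two local GAP SCSCP servers like those used in the examples above),
a minimal setting might be:

 Example 
SCSCPservers := [ [ "localhost", 26133 ], [ "localhost", 26134 ] ];
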
8.1-4 ParQuickWithSCSCP

ParQuickWithSCSCP( commands, listargs )  function
Returns: record with components object and attributes

This function is constructed using FirstProcess (8.1-2). It is useful when
it is not known in advance which particular method is more efficient: it
calls in parallel several procedures (given by the list of their names
commands) with the same list of arguments listargs (having the same meaning
as in EvaluateBySCSCP (6.3-1)) and returns the result of whichever procedure
call is computed first.

In the example below we call two factorisation methods from the GAP package
FactInt to factorise 2^150+1. The example is selected in such a way that the
runtime of these two methods is approximately the same, so repeated calls
should return results from either method in some random order.
 Example 
gap> ParQuickWithSCSCP( [ "WS_FactorsECM", "WS_FactorsMPQS" ], [ 2^150+1 ] );
rec( attributes := [ [ "call_id", "localhost:26133:53877:GQX8MhC8" ] ],
  object := [ [ 5, 5, 5, 13, 41, 61, 101, 1201, 1321, 63901 ],
    [ 2175126601, 15767865236223301 ] ] )

8.1-5 FirstTrueProcess

FirstTrueProcess( process1, process2, ..., processN )  function
FirstTrueProcess( proclist )  function
Returns: list of records

The function waits for the results from the processes given in the argument
and stops waiting as soon as the first true is returned, abandoning all
remaining processes. It returns a list containing a record with components
object and attributes at the position corresponding to the process that
returned true. If none of the processes returns true, it returns the
complete list of procedure call results.

The function accepts either a single argument that is a list of processes,
or an arbitrary number of arguments, each of which is a process.

In the first example, the second call returns true:
 Example 
gap> a:=NewProcess( "IsPrimeInt", [2^15013-1], "localhost", 26134 );
< process at localhost:26134 pid=42554 >
gap> b:=NewProcess( "IsPrimeInt", [2^521-1], "localhost", 26133 );
< process at localhost:26133 pid=42448 >
gap> FirstTrueProcess(a,b);
[ , rec( attributes := [ [ "call_id", "localhost:26133:42448:Lz1DL0ON" ] ],
  object := true ) ]

In the next example both calls return false:
 Example 
gap> a:=NewProcess( "IsPrimeInt", [2^520-1], "localhost", 26133 );
< process at localhost:26133 pid=42448 >
gap> b:=NewProcess( "IsPrimeInt", [2^15013-1], "localhost", 26134 );
< process at localhost:26134 pid=42554 >
gap> FirstTrueProcess(a,b);
[ rec( attributes := [ [ "call_id", "localhost:26133:42448:nvsk8PQp" ] ],
  object := false ),
  rec( attributes := [ [ "call_id", "localhost:26134:42554:JnEYuXL8" ] ],
  object := false ) ]

8.2 MasterWorker skeleton

In this section we present a more general framework for running parallel
computations, which has a number of useful features:

 it is implemented purely in GAP;

 the client (i.e. the master, which orchestrates the computation) works
under UNIX/Linux, Mac OS X and MS Windows;

 it may orchestrate both GAP and non-GAP SCSCP servers;

 if one of the servers (i.e. workers) is lost, the computation is retried
on another available server;

 new workers may be added dynamically during the computation, on hostnames
and ports from a range previously declared in SCSCPservers (8.1-3).

To configure this functionality, the file scscp/configpar.g assigns the
global variable SCSCPservers, which specifies a list of hosts and ports to
search for SCSCP services (which may be provided not only by GAP services
but also by other SCSCP-compliant systems). See the comments in this file
for further instructions.
8.2-1 ParListWithSCSCP

ParListWithSCSCP( listargs, procname )  function
Returns: list

ParListWithSCSCP implements the well-known master-worker skeleton: we have a
master (SCSCP client) and a number of workers (SCSCP servers) which obtain
pieces of work from the client, perform the required job and report back
with the result, waiting for the next job.

It returns a list of the same length as listargs, whose i-th element is the
result of calling the procedure procname with the argument listargs[i].

It accepts two options which should be given as non-negative integers:
timeout, which specifies in minutes how long the client must wait for the
result (the default value is one hour), and recallfrequency, which specifies
the number of iterations after which the search for new services is
performed (the default value is zero, meaning no such search at all). There
is also a boolean option noretry which, if set to true, means that no
retrying calls are performed when the timeout is exceeded, so an incomplete
result may be returned.
 Example 
gap> ParListWithSCSCP( List( [2..6], n -> SymmetricGroup(n)), "WS_IdGroup" );
#I master -> [ "localhost", 26133 ] : SymmetricGroup( [ 1 .. 2 ] )
#I master -> [ "localhost", 26134 ] : SymmetricGroup( [ 1 .. 3 ] )
#I [ "localhost", 26133 ] --> master : [ 2, 1 ]
#I master -> [ "localhost", 26133 ] : SymmetricGroup( [ 1 .. 4 ] )
#I [ "localhost", 26134 ] --> master : [ 6, 1 ]
#I master -> [ "localhost", 26134 ] : SymmetricGroup( [ 1 .. 5 ] )
#I [ "localhost", 26133 ] --> master : [ 24, 12 ]
#I master -> [ "localhost", 26133 ] : SymmetricGroup( [ 1 .. 6 ] )
#I [ "localhost", 26133 ] --> master : [ 720, 763 ]
#I [ "localhost", 26134 ] --> master : [ 120, 34 ]
[ [ 2, 1 ], [ 6, 1 ], [ 24, 12 ], [ 120, 34 ], [ 720, 763 ] ]

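A sketch of passing these options (assuming they are given via GAP's standard
options stack, i.e. the colon syntax; the particular values are illustrative):

 Example 
ParListWithSCSCP( [1..100], "WS_Phi" :
  timeout := 5, recallfrequency := 10, noretry := true );
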
8.2-2 SCSCPreset

SCSCPreset( )  function
Returns: nothing

If an error occurs during a call of ParQuickWithSCSCP (8.1-4) or
ParListWithSCSCP (8.2-1), some of the parallel requests may still be running
at the remaining services, making them inaccessible for further procedure
calls. SCSCPreset resets them by closing all open streams to SCSCP servers.
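A minimal usage sketch: after such an error, call it with no arguments
before issuing further requests.

 Example 
SCSCPreset();
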
8.2-3 SCSCPLogTracesToGlobal

SCSCPLogTracesToGlobal( testname )  function
SCSCPLogTracesToGlobal( )  function

To analyse the performance of the parallel SCSCP framework, we make use of
the EdenTV program [BL07], initially developed to visualize the performance
of parallel programs written in the functional programming language Eden,
and now distributed under the GNU Public License and available from
http://www.mathematik.uni-marburg.de/~eden/?content=EdenTV.

Called with a string containing the name of the test, this function turns on
writing information about key activity events into trace files in the
current directories of the client and of the servers listed in SCSCPservers
(8.1-3). The trace file is named testname.client.tr for the client and
testname.<hostname>.<port>.tr for a server. After the test these files
should be collected from the remote servers and concatenated (e.g. using
cat) together with the standard preamble from the file
scscp/tracing/stdhead.txt (we recommend placing all server traces directly
after the preamble, followed by the client's traces, to obtain nicer
diagrams). The resulting file may then be opened with EdenTV.

In the following example we use a dual core MacBook laptop to generate trace
files for two tests and then show their corresponding trace diagrams:
 Example 
SCSCPLogTracesToGlobal("quillen100");
ParListWithSCSCP( List( [1..100], i->[512,i]), "QuillenSeriesByIdGroup" );
SCSCPLogTracesToGlobal();
SCSCPLogTracesToGlobal( "euler" );
ParListWithSCSCP( [1..1000], "WS_Phi" );
SCSCPLogTracesToGlobal();

/See diagrams in HTML and PDF versions of the manual/ The diagrams (made on
a dual core MacBook laptop) show that in the first case parallelising is
efficient and the master successfully distributes the load to the workers,
while in the second case a single computation is just too short, so most of
the time is spent on communication. To parallelize the Euler's function
example efficiently, the tasks should rather be grouped into chunks, large
enough to reduce the communication overhead but small enough to ensure that
tasks are evenly distributed.
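As a rough illustration of such chunking (a sketch only: WS_PhiChunk is a
hypothetical server-side procedure, assumed to be installed with
InstallSCSCPprocedure, that applies Phi to every entry of the list it
receives), the range could be split into blocks which are distributed as
whole units:

 Example 
# split [1..1000] into 10 chunks of 100 consecutive integers each
chunks := List( [0..9], k -> [ 100*k+1 .. 100*(k+1) ] );;
# each chunk is sent as a single task; the server returns one list of values per chunk
results := ParListWithSCSCP( chunks, "WS_PhiChunk" );;
# flatten the per-chunk results back into a single list
phis := Concatenation( results );;
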
Of course, tracing can be used to investigate communication between a client
and a single server in a non-parallel context as well. For this purpose,
SCSCPservers (8.1-3) must be modified to contain only one server.

ParListWithSCSCP (8.2-1) can easily be modified to provide parallel versions
of other list operations such as ForAll (Reference: ForAll), ForAny
(Reference: ForAny), First (Reference: First), Number (Reference: Number)
and Filtered (Reference: Filtered), and also to yield a skeleton in which
the queue may be modified during the computation (for example, to compute
orbits). We plan to provide such tools in one of the next versions of the
package.

8.3 Example: parallelising Karatsuba multiplication for polynomials

The file scscp/example/karatsuba.g contains an implementation of the
Karatsuba multiplication algorithm for polynomials. This algorithm can be
easily parallelized since each recursive step creates three recursive calls
of the same function for other polynomials. We will not parallelize every
recursive call, since this would create an enormous data flow. Instead, we
parallelize only the top-level function. For our experiments with
parallelising Karatsuba multiplication for polynomials with integer
coefficients we used a multi-core workstation, on which we started one SCSCP
client and two SCSCP servers. To use it, modify the server configuration
file by adding the command to read the file scscp/example/karatsuba.g, then
define there the following function
 Example 
KaratsubaPolynomialMultiplicationExtRepByString:=function(s1,s2)
  return String( KaratsubaPolynomialMultiplicationExtRep(
    EvalString(s1), EvalString(s2) ) );
end;;

and finally add the following lines to make it available as an SCSCP
procedure under the name WS_Karatsuba:
 Example 
InstallSCSCPprocedure( "WS_Karatsuba",
  KaratsubaPolynomialMultiplicationExtRepByString);

(we do not include it in the default scscp/example/myserver.g since the code
contains a call to EvalString (Reference: EvalString)).

This function provides a "bridge" between the client's function
KaratsubaPolynomialMultiplicationWS and the server's function
KaratsubaPolynomialMultiplicationExtRep, which performs the actual work on
the server. WS_Karatsuba converts its string arguments into the internal
representation of univariate polynomials (basically, lists of integers) and
then converts the result back into a string (since this data exchange format
was chosen). We are going to parallelize the following part of the client's
code:
 Example 
...
u := KaratsubaPolynomialMultiplicationExtRep(f1,g1);
v := KaratsubaPolynomialMultiplicationExtRep(f0,g0);
w := KaratsubaPolynomialMultiplicationExtRep(
  PlusLaurentPolynomialsExtRep(f1,f0),
  PlusLaurentPolynomialsExtRep(g1,g0) );
...

and this can be done straightforwardly: we replace the first two calls with
calls to the appropriate SCSCP services, then perform the third call locally
and finally collect the results from the two remote calls:
 Example 
...
u := NewProcess( "WS_Karatsuba",[ String(f1), String(g1) ],"localhost", 26133);
v := NewProcess( "WS_Karatsuba",[ String(f0), String(g0) ],"localhost", 26134);
w := KaratsubaPolynomialMultiplicationExtRep(
  PlusLaurentPolynomialsExtRep(f1,f0),
  PlusLaurentPolynomialsExtRep(g1,g0) );
wsresult:=SynchronizeProcesses2( u,v );
u := EvalString( wsresult[1].object );
v := EvalString( wsresult[2].object );
...

We obtain an almost twofold speedup on three cores for randomly generated
polynomials of degree 32000:
 Example 
gap> ReadPackage("scscp/example/karatsuba.g");
gap> fam:=FamilyObj(1);;
gap> f:=LaurentPolynomialByCoefficients( fam,
>  List([1..32000],i->Random(Integers)), 0, 1 );;
gap> g:=LaurentPolynomialByCoefficients( fam,
>  List([1..32000],i->Random(Integers)), 0, 1 );;
gap> t2:=KaratsubaPolynomialMultiplication(f,g);;time;
5892
gap> t3:=KaratsubaPolynomialMultiplicationWS(f,g);;time;
2974