Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
allendowney
GitHub Repository: allendowney/cpython
Path: blob/main/Modules/_decimal/libmpdec/literature/mulmod-ppro.txt
12 views
1
2
3
(* Copyright (c) 2011-2020 Stefan Krah. All rights reserved. *)
4
5
6
========================================================================
7
Calculate (a * b) % p using the 80-bit x87 FPU
8
========================================================================
9
10
A description of the algorithm can be found in the apfloat manual by
11
Tommila [1].
12
13
The proof follows an argument made by Granlund/Montgomery in [2].
14
15
16
Definitions and assumptions:
17
----------------------------
18
19
The 80-bit extended precision format uses 64 bits for the significand:
20
21
(1) F = 64
22
23
The modulus is prime and less than 2**31:
24
25
(2) 2 <= p < 2**31
26
27
The factors are less than p:
28
29
(3) 0 <= a < p
30
(4) 0 <= b < p
31
32
The product a * b is less than 2**62 and is thus exact in 64 bits:
33
34
(5) n = a * b
35
36
The product can be represented in terms of quotient and remainder:
37
38
(6) n = q * p + r
39
40
Using (3), (4) and the fact that p is prime, the remainder is always
41
greater than zero:
42
43
(7) 0 <= q < p /\ 1 <= r < p
44
45
46
Strategy:
47
---------
48
49
Precalculate the 80-bit long double inverse of p, with a maximum
50
relative error of 2**(1-F):
51
52
(8) pinv = (long double)1.0 / p
53
54
Calculate an estimate for q = floor(n/p). The multiplication has another
55
maximum relative error of 2**(1-F):
56
57
(9) qest = n * pinv
58
59
If we can show that q < qest < q+1, then trunc(qest) = q. It is then
60
easy to recover the remainder r. The complete algorithm is:
61
62
a) Set the control word to 64-bit precision and truncation mode.
63
64
b) n = a * b # Calculate exact product.
65
66
c) qest = n * pinv # Calculate estimate for the quotient.
67
68
d) q = (qest+2**63)-2**63 # Truncate qest to the exact quotient.
69
70
f) r = n - q * p # Calculate remainder.
71
72
73
Proof for q < qest < q+1:
74
-------------------------
75
76
Using the cumulative error, the error bounds for qest are:
77
78
n n * (1 + 2**(1-F))**2
79
(9) --------------------- <= qest <= ---------------------
80
p * (1 + 2**(1-F))**2 p
81
82
83
Lemma 1:
84
--------
85
n q * p + r
86
(10) q < --------------------- = ---------------------
87
p * (1 + 2**(1-F))**2 p * (1 + 2**(1-F))**2
88
89
90
Proof:
91
~~~~~~
92
93
(I) q * p * (1 + 2**(1-F))**2 < q * p + r
94
95
(II) q * p * 2**(2-F) + q * p * 2**(2-2*F) < r
96
97
Using (1) and (7), it is sufficient to show that:
98
99
(III) q * p * 2**(-62) + q * p * 2**(-126) < 1 <= r
100
101
(III) can easily be verified by substituting the largest possible
102
values p = 2**31-1 and q = 2**31-2.
103
104
The critical cases occur when r = 1, n = m * p + 1. These cases
105
can be exhaustively verified with a test program.
106
107
108
Lemma 2:
109
--------
110
111
n * (1 + 2**(1-F))**2 (q * p + r) * (1 + 2**(1-F))**2
112
(11) --------------------- = ------------------------------- < q + 1
113
p p
114
115
Proof:
116
~~~~~~
117
118
(I) (q * p + r) + (q * p + r) * 2**(2-F) + (q * p + r) * 2**(2-2*F) < q * p + p
119
120
(II) (q * p + r) * 2**(2-F) + (q * p + r) * 2**(2-2*F) < p - r
121
122
Using (1) and (7), it is sufficient to show that:
123
124
(III) (q * p + r) * 2**(-62) + (q * p + r) * 2**(-126) < 1 <= p - r
125
126
(III) can easily be verified by substituting the largest possible
127
values p = 2**31-1, q = 2**31-2 and r = 2**31-2.
128
129
The critical cases occur when r = (p - 1), n = m * p - 1. These cases
130
can be exhaustively verified with a test program.
131
132
133
[1] http://www.apfloat.org/apfloat/2.40/apfloat.pdf
134
[2] http://gmplib.org/~tege/divcnst-pldi94.pdf
135
[Section 7: "Use of floating point"]
136
137
138
139
(* Coq proof for (10) and (11) *)
140
141
Require Import ZArith.
142
Require Import QArith.
143
Require Import Qpower.
144
Require Import Qabs.
145
Require Import Psatz.
146
147
Open Scope Q_scope.
148
149
150
Ltac qreduce T :=
151
rewrite <- (Qred_correct (T)); simpl (Qred (T)).
152
153
Theorem Qlt_move_right :
154
forall x y z:Q, x + z < y <-> x < y - z.
155
Proof.
156
intros.
157
split.
158
intros.
159
psatzl Q.
160
intros.
161
psatzl Q.
162
Qed.
163
164
Theorem Qlt_mult_by_z :
165
forall x y z:Q, 0 < z -> (x < y <-> x * z < y * z).
166
Proof.
167
intros.
168
split.
169
intros.
170
apply Qmult_lt_compat_r. trivial. trivial.
171
intros.
172
rewrite <- (Qdiv_mult_l x z). rewrite <- (Qdiv_mult_l y z).
173
apply Qmult_lt_compat_r.
174
apply Qlt_shift_inv_l.
175
trivial. psatzl Q. trivial. psatzl Q. psatzl Q.
176
Qed.
177
178
Theorem Qle_mult_quad :
179
forall (a b c d:Q),
180
0 <= a -> a <= c ->
181
0 <= b -> b <= d ->
182
a * b <= c * d.
183
intros.
184
psatz Q.
185
Qed.
186
187
188
Theorem q_lt_qest:
189
forall (p q r:Q),
190
(0 < p) -> (p <= (2#1)^31 - 1) ->
191
(0 <= q) -> (q <= p - 1) ->
192
(1 <= r) -> (r <= p - 1) ->
193
q < (q * p + r) / (p * (1 + (2#1)^(-63))^2).
194
Proof.
195
intros.
196
rewrite Qlt_mult_by_z with (z := (p * (1 + (2#1)^(-63))^2)).
197
198
unfold Qdiv.
199
rewrite <- Qmult_assoc.
200
rewrite (Qmult_comm (/ (p * (1 + (2 # 1) ^ (-63)) ^ 2)) (p * (1 + (2 # 1) ^ (-63)) ^ 2)).
201
rewrite Qmult_inv_r.
202
rewrite Qmult_1_r.
203
204
assert (q * (p * (1 + (2 # 1) ^ (-63)) ^ 2) == q * p + (q * p) * ((2 # 1) ^ (-62) + (2 # 1) ^ (-126))).
205
qreduce ((1 + (2 # 1) ^ (-63)) ^ 2).
206
qreduce ((2 # 1) ^ (-62) + (2 # 1) ^ (-126)).
207
ring_simplify.
208
reflexivity.
209
rewrite H5.
210
211
rewrite Qplus_comm.
212
rewrite Qlt_move_right.
213
ring_simplify (q * p + r - q * p).
214
qreduce ((2 # 1) ^ (-62) + (2 # 1) ^ (-126)).
215
216
apply Qlt_le_trans with (y := 1).
217
rewrite Qlt_mult_by_z with (z := 85070591730234615865843651857942052864 # 18446744073709551617).
218
ring_simplify.
219
220
apply Qle_lt_trans with (y := ((2 # 1) ^ 31 - (2#1)) * ((2 # 1) ^ 31 - 1)).
221
apply Qle_mult_quad.
222
assumption. psatzl Q. psatzl Q. psatzl Q. psatzl Q. psatzl Q. assumption. psatzl Q. psatzl Q.
223
Qed.
224
225
Theorem qest_lt_qplus1:
226
forall (p q r:Q),
227
(0 < p) -> (p <= (2#1)^31 - 1) ->
228
(0 <= q) -> (q <= p - 1) ->
229
(1 <= r) -> (r <= p - 1) ->
230
((q * p + r) * (1 + (2#1)^(-63))^2) / p < q + 1.
231
Proof.
232
intros.
233
rewrite Qlt_mult_by_z with (z := p).
234
235
unfold Qdiv.
236
rewrite <- Qmult_assoc.
237
rewrite (Qmult_comm (/ p) p).
238
rewrite Qmult_inv_r.
239
rewrite Qmult_1_r.
240
241
assert ((q * p + r) * (1 + (2 # 1) ^ (-63)) ^ 2 == q * p + r + (q * p + r) * ((2 # 1) ^ (-62) + (2 # 1) ^ (-126))).
242
qreduce ((1 + (2 # 1) ^ (-63)) ^ 2).
243
qreduce ((2 # 1) ^ (-62) + (2 # 1) ^ (-126)).
244
ring_simplify. reflexivity.
245
rewrite H5.
246
247
rewrite <- Qplus_assoc. rewrite <- Qplus_comm. rewrite Qlt_move_right.
248
ring_simplify ((q + 1) * p - q * p).
249
250
rewrite <- Qplus_comm. rewrite Qlt_move_right.
251
252
apply Qlt_le_trans with (y := 1).
253
qreduce ((2 # 1) ^ (-62) + (2 # 1) ^ (-126)).
254
255
rewrite Qlt_mult_by_z with (z := 85070591730234615865843651857942052864 # 18446744073709551617).
256
ring_simplify.
257
258
ring_simplify in H0.
259
apply Qle_lt_trans with (y := (2147483646 # 1) * (2147483647 # 1) + (2147483646 # 1)).
260
261
apply Qplus_le_compat.
262
apply Qle_mult_quad.
263
assumption. psatzl Q. auto with qarith. assumption. psatzl Q.
264
auto with qarith. auto with qarith.
265
psatzl Q. psatzl Q. assumption.
266
Qed.
267
268
269
270
271