Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
sagemath
GitHub Repository: sagemath/sagesmc
Path: blob/master/src/sage/finance/time_series.pyx
8815 views
1
"""
2
Time Series
3
4
This is a module for working with discrete floating point time series.
5
It is designed so that every operation is very fast, typically much
6
faster than with other generic code, e.g., Python lists of doubles or
7
even NumPy arrays. The semantics of time series is more similar to
8
Python lists of doubles than Sage real double vectors or NumPy 1-D
9
arrays. In particular, time series are not endowed with much
10
algebraic structure and are always mutable.
11
12
.. NOTE::
13
14
NumPy arrays are faster at slicing, since slices return
15
references, and NumPy arrays have strides. However, this speed at
16
slicing makes NumPy slower at certain other operations.
17
18
EXAMPLES::
19
20
sage: set_random_seed(1)
21
sage: t = finance.TimeSeries([random()-0.5 for _ in xrange(10)]); t
22
[0.3294, 0.0959, -0.0706, -0.4646, 0.4311, 0.2275, -0.3840, -0.3528, -0.4119, -0.2933]
23
sage: t.sums()
24
[0.3294, 0.4253, 0.3547, -0.1099, 0.3212, 0.5487, 0.1647, -0.1882, -0.6001, -0.8933]
25
sage: t.exponential_moving_average(0.7)
26
[0.0000, 0.3294, 0.1660, 0.0003, -0.3251, 0.2042, 0.2205, -0.2027, -0.3078, -0.3807]
27
sage: t.standard_deviation()
28
0.33729638212891383
29
sage: t.mean()
30
-0.08933425506929439
31
sage: t.variance()
32
0.1137688493972542...
33
34
AUTHOR:
35
36
- William Stein
37
"""
38
39
include "sage/ext/cdefs.pxi"
40
include "sage/ext/stdsage.pxi"
41
include "sage/ext/random.pxi"
42
include "sage/ext/python_slice.pxi"
43
from cpython.string cimport *
44
45
cdef extern from "math.h":
46
double exp(double)
47
double floor(double)
48
double log(double)
49
double pow(double, double)
50
double sqrt(double)
51
52
cdef extern from "string.h":
53
void* memcpy(void* dst, void* src, size_t len)
54
55
cimport numpy as cnumpy
56
57
from sage.rings.integer import Integer
58
from sage.rings.real_double import RDF
59
from sage.modules.vector_real_double_dense cimport Vector_real_double_dense
60
61
max_print = 10
62
digits = 4
63
64
cdef class TimeSeries:
65
def __cinit__(self):
66
"""
67
Create new empty uninitialized time series.
68
69
EXAMPLES:
70
71
This implicitly calls new::
72
73
sage: finance.TimeSeries([1,3,-4,5])
74
[1.0000, 3.0000, -4.0000, 5.0000]
75
"""
76
self._values = NULL
77
78
def __init__(self, values, bint initialize=True):
79
"""
80
Initialize new time series.
81
82
INPUT:
83
84
- ``values`` -- integer (number of values) or an iterable of
85
floats.
86
87
- ``initialize`` -- bool (default: ``True``); if ``False``, do not
88
bother to zero out the entries of the new time series.
89
For large series that you are going to just fill in,
90
this can be way faster.
91
92
EXAMPLES:
93
94
This implicitly calls init::
95
96
sage: finance.TimeSeries([pi, 3, 18.2])
97
[3.1416, 3.0000, 18.2000]
98
99
Conversion from a NumPy 1-D array, which is very fast::
100
101
sage: v = finance.TimeSeries([1..5])
102
sage: w = v.numpy()
103
sage: finance.TimeSeries(w)
104
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000]
105
106
Conversion from an n-dimensional NumPy array also works::
107
108
sage: import numpy
109
sage: v = numpy.array([[1,2], [3,4]], dtype=float); v
110
array([[ 1., 2.],
111
[ 3., 4.]])
112
sage: finance.TimeSeries(v)
113
[1.0000, 2.0000, 3.0000, 4.0000]
114
sage: finance.TimeSeries(v[:,0])
115
[1.0000, 3.0000]
116
sage: u = numpy.array([[1,2],[3,4]])
117
sage: finance.TimeSeries(u)
118
[1.0000, 2.0000, 3.0000, 4.0000]
119
120
For speed purposes we don't initialize (so value is garbage)::
121
122
sage: t = finance.TimeSeries(10, initialize=False)
123
"""
124
cdef Vector_real_double_dense z
125
cdef cnumpy.ndarray np
126
cdef double *np_data
127
cdef unsigned int j
128
if isinstance(values, (int, long, Integer)):
129
self._length = values
130
values = None
131
elif PY_TYPE_CHECK(values, Vector_real_double_dense) or PY_TYPE_CHECK(values, cnumpy.ndarray):
132
if PY_TYPE_CHECK(values, Vector_real_double_dense):
133
np = values._vector_numpy
134
else:
135
np = values
136
137
if np.ndim != 1:
138
np = np.reshape([np.size])
139
140
# Make the array be the correct type and have a C array
141
# for a data structure. If the array already is the
142
# correct type and has a C array, nothing is done, so this
143
# should be fast in the common case.
144
np = np.astype('double')
145
np = cnumpy.PyArray_GETCONTIGUOUS(np)
146
np_data = <double*> cnumpy.PyArray_DATA(np)
147
self._length = np.shape[0]
148
self._values = <double*> sage_malloc(sizeof(double) * self._length)
149
if self._values == NULL:
150
raise MemoryError
151
152
memcpy(self._values, np_data, sizeof(double)*self._length)
153
return
154
else:
155
values = [float(x) for x in values]
156
self._length = len(values)
157
158
self._values = <double*> sage_malloc(sizeof(double) * self._length)
159
if self._values == NULL:
160
raise MemoryError
161
if not initialize: return
162
cdef Py_ssize_t i
163
if values is not None:
164
for i from 0 <= i < self._length:
165
self._values[i] = values[i]
166
else:
167
for i from 0 <= i < self._length:
168
self._values[i] = 0
169
170
def __reduce__(self):
171
"""
172
Used in pickling time series.
173
174
EXAMPLES::
175
176
sage: v = finance.TimeSeries([1,-3.5])
177
sage: v.__reduce__()
178
(<built-in function unpickle_time_series_v1>, (..., 2))
179
sage: loads(dumps(v)) == v
180
True
181
182
Note that dumping and loading with compress ``False`` is much faster,
183
though dumping with compress ``True`` can save a lot of space::
184
185
sage: v = finance.TimeSeries([1..10^5])
186
sage: loads(dumps(v, compress=False),compress=False) == v
187
True
188
"""
189
buf = PyString_FromStringAndSize(<char*>self._values, self._length*sizeof(double)/sizeof(char))
190
return unpickle_time_series_v1, (buf, self._length)
191
192
def __cmp__(self, _other):
193
"""
194
Compare ``self`` and ``other``. This has the same semantics
195
as list comparison.
196
197
EXAMPLES::
198
199
sage: v = finance.TimeSeries([1,2,3]); w = finance.TimeSeries([1,2])
200
sage: v < w
201
False
202
sage: w < v
203
True
204
sage: v == v
205
True
206
sage: w == w
207
True
208
"""
209
cdef TimeSeries other
210
cdef Py_ssize_t c, i
211
cdef double d
212
if not isinstance(_other, TimeSeries):
213
_other = TimeSeries(_other)
214
other = _other
215
for i from 0 <= i < min(self._length, other._length):
216
d = self._values[i] - other._values[i]
217
if d:
218
return -1 if d < 0 else 1
219
c = self._length - other._length
220
if c < 0:
221
return -1
222
elif c > 0:
223
return 1
224
return 0
225
226
def __dealloc__(self):
227
"""
228
Free up memory used by a time series.
229
230
EXAMPLES:
231
232
This tests ``__dealloc__`` implicitly::
233
234
sage: v = finance.TimeSeries([1,3,-4,5])
235
sage: del v
236
"""
237
if self._values:
238
sage_free(self._values)
239
240
def vector(self):
241
"""
242
Return real double vector whose entries are the values of this
243
time series. This is useful since vectors have standard
244
algebraic structure and play well with matrices.
245
246
OUTPUT:
247
248
A real double vector.
249
250
EXAMPLES::
251
252
sage: v = finance.TimeSeries([1..10])
253
sage: v.vector()
254
(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0)
255
"""
256
V = RDF**self._length
257
# A copy of the numpy array is made in the vector constructor
258
cdef Vector_real_double_dense x = Vector_real_double_dense(V, self.numpy(copy=False))
259
return x
260
261
def __repr__(self):
262
"""
263
Return string representation of ``self``.
264
265
EXAMPLES::
266
267
sage: v = finance.TimeSeries([1,3.1908439,-4,5.93932])
268
sage: v.__repr__()
269
'[1.0000, 3.1908, -4.0000, 5.9393]'
270
271
By default 4 digits after the decimal point are displayed. To
272
change this, change ``self.finance.time_series.digits``. ::
273
274
sage: sage.finance.time_series.digits = 2
275
sage: v.__repr__()
276
'[1.00, 3.19, -4.00, 5.94]'
277
sage: v
278
[1.00, 3.19, -4.00, 5.94]
279
sage: sage.finance.time_series.digits = 4
280
sage: v
281
[1.0000, 3.1908, -4.0000, 5.9393]
282
"""
283
return self._repr()
284
285
def _repr(self, prec=None):
286
"""
287
Print representation of a time series.
288
289
INPUT:
290
291
- ``prec`` -- (default: ``None``) number of digits of precision or
292
``None``. If ``None`` use the default
293
``sage.finance.time_series.digits``.
294
295
OUTPUT:
296
297
A string.
298
299
EXAMPLES::
300
301
sage: v = finance.TimeSeries([1,3.1908439,-4,5.93932])
302
sage: v._repr()
303
'[1.0000, 3.1908, -4.0000, 5.9393]'
304
sage: v._repr(10)
305
'[1.0000000000, 3.1908439000, -4.0000000000, 5.9393200000]'
306
sage: v._repr(2)
307
'[1.00, 3.19, -4.00, 5.94]'
308
"""
309
if prec is None: prec = digits
310
format = '%.' + str(prec) + 'f'
311
if len(self) > max_print:
312
v0 = self[:max_print//2]
313
v1 = self[-max_print//2:]
314
return '[' + ', '.join([format%x for x in v0]) + ' ... ' + \
315
', '.join([format%x for x in v1]) + ']'
316
else:
317
return '[' + ', '.join([format%x for x in self]) + ']'
318
319
def __len__(self):
320
"""
321
Return the number of entries in this time series.
322
323
OUTPUT:
324
325
Python integer.
326
327
EXAMPLES::
328
329
sage: v = finance.TimeSeries([1,3.1908439,-4,5.93932])
330
sage: v.__len__()
331
4
332
sage: len(v)
333
4
334
"""
335
return self._length
336
337
def __getitem__(self, i):
338
"""
339
Return i-th entry or slice of ``self``.
340
341
EXAMPLES::
342
343
sage: v = finance.TimeSeries([1,-4,3,-2.5,-4,3])
344
sage: v[2]
345
3.0
346
sage: v[-1]
347
3.0
348
sage: v[-10]
349
Traceback (most recent call last):
350
...
351
IndexError: TimeSeries index out of range
352
sage: v[5]
353
3.0
354
sage: v[6]
355
Traceback (most recent call last):
356
...
357
IndexError: TimeSeries index out of range
358
359
Some slice examples::
360
361
sage: v[-3:]
362
[-2.5000, -4.0000, 3.0000]
363
sage: v[-3:-1]
364
[-2.5000, -4.0000]
365
sage: v[::2]
366
[1.0000, 3.0000, -4.0000]
367
sage: v[3:20]
368
[-2.5000, -4.0000, 3.0000]
369
sage: v[3:2]
370
[]
371
372
Make a copy::
373
374
sage: v[:]
375
[1.0000, -4.0000, 3.0000, -2.5000, -4.0000, 3.0000]
376
377
Reverse the time series::
378
379
sage: v[::-1]
380
[3.0000, -4.0000, -2.5000, 3.0000, -4.0000, 1.0000]
381
"""
382
cdef Py_ssize_t start, stop, step, j
383
cdef TimeSeries t
384
if PySlice_Check(i):
385
start = 0 if (i.start is None) else i.start
386
stop = self._length if (i.stop is None) else i.stop
387
step = 1 if (i.step is None) else i.step
388
if start < 0:
389
start += self._length
390
if start < 0: start = 0
391
elif start >= self._length:
392
start = self._length - 1
393
if stop < 0:
394
stop += self._length
395
if stop < 0: stop = 0
396
elif stop > self._length:
397
stop = self._length
398
if start >= stop:
399
return new_time_series(0)
400
if step < 0:
401
step = -step
402
t = new_time_series((stop-start)/step)
403
for j from 0 <= j < (stop-start)/step:
404
t._values[j] = self._values[stop-1 - j*step]
405
elif step > 1:
406
t = new_time_series((stop-start)/step)
407
for j from 0 <= j < (stop-start)/step:
408
t._values[j] = self._values[j*step+start]
409
else:
410
t = new_time_series(stop-start)
411
memcpy(t._values, self._values + start, sizeof(double)*t._length)
412
return t
413
else:
414
j = i
415
if j < 0:
416
j += self._length
417
if j < 0:
418
raise IndexError, "TimeSeries index out of range"
419
elif j >= self._length:
420
raise IndexError, "TimeSeries index out of range"
421
return self._values[j]
422
423
def __setitem__(self, Py_ssize_t i, double x):
424
"""
425
Set the i-th entry of ``self`` to ``x``.
426
427
INPUT:
428
429
- i -- a nonnegative integer.
430
431
- x -- a float.
432
433
EXAMPLES::
434
435
sage: v = finance.TimeSeries([1,3,-4,5.93932]); v
436
[1.0000, 3.0000, -4.0000, 5.9393]
437
sage: v[0] = -5.5; v
438
[-5.5000, 3.0000, -4.0000, 5.9393]
439
sage: v[-1] = 3.2; v
440
[-5.5000, 3.0000, -4.0000, 3.2000]
441
sage: v[10]
442
Traceback (most recent call last):
443
...
444
IndexError: TimeSeries index out of range
445
sage: v[-5]
446
Traceback (most recent call last):
447
...
448
IndexError: TimeSeries index out of range
449
"""
450
if i < 0:
451
i += self._length
452
if i < 0:
453
raise IndexError, "TimeSeries index out of range"
454
elif i >= self._length:
455
raise IndexError, "TimeSeries index out of range"
456
self._values[i] = x
457
458
def __copy__(self):
459
"""
460
Return a copy of ``self``.
461
462
EXAMPLES::
463
464
sage: v = finance.TimeSeries([1,-4,3,-2.5,-4,3])
465
sage: v.__copy__()
466
[1.0000, -4.0000, 3.0000, -2.5000, -4.0000, 3.0000]
467
sage: copy(v)
468
[1.0000, -4.0000, 3.0000, -2.5000, -4.0000, 3.0000]
469
sage: copy(v) is v
470
False
471
"""
472
cdef Py_ssize_t i
473
cdef TimeSeries t = new_time_series(self._length)
474
memcpy(t._values, self._values , sizeof(double)*self._length)
475
return t
476
477
def __add__(left, right):
478
"""
479
Concatenate the time series ``self`` and ``right``.
480
481
.. NOTE::
482
483
To add a single number to the entries of a time series,
484
use the ``add_scalar`` method, and to add componentwise use
485
the ``add_entries`` method.
486
487
INPUT:
488
489
- ``right`` -- a time series.
490
491
OUTPUT:
492
493
A time series.
494
495
EXAMPLES::
496
497
sage: v = finance.TimeSeries([1,2,3]); w = finance.TimeSeries([1,2])
498
sage: v + w
499
[1.0000, 2.0000, 3.0000, 1.0000, 2.0000]
500
sage: v = finance.TimeSeries([1,2,-5]); v
501
[1.0000, 2.0000, -5.0000]
502
503
Note that both summands must be a time series::
504
505
sage: v + xrange(4)
506
Traceback (most recent call last):
507
...
508
TypeError: right operand must be a time series
509
sage: [1,5] + v
510
Traceback (most recent call last):
511
...
512
TypeError: left operand must be a time series
513
"""
514
if not isinstance(right, TimeSeries):
515
raise TypeError, "right operand must be a time series"
516
if not isinstance(left, TimeSeries):
517
raise TypeError, "left operand must be a time series"
518
cdef TimeSeries R = right
519
cdef TimeSeries L = left
520
cdef TimeSeries t = new_time_series(L._length + R._length)
521
memcpy(t._values, L._values, sizeof(double)*L._length)
522
memcpy(t._values + L._length, R._values, sizeof(double)*R._length)
523
return t
524
525
def __mul__(left, right):
526
"""
527
Multiply a time series by an integer n, which (like for lists)
528
results in the time series concatenated with itself n times.
529
530
.. NOTE::
531
532
To multiply all the entries of a time series by a single
533
scalar, use the ``scale`` method.
534
535
INPUT:
536
537
- ``left``, ``right`` -- an integer and a time series.
538
539
OUTPUT:
540
541
A time series.
542
543
EXAMPLES::
544
545
sage: v = finance.TimeSeries([1,2,-5]); v
546
[1.0000, 2.0000, -5.0000]
547
sage: v*3
548
[1.0000, 2.0000, -5.0000, 1.0000, 2.0000, -5.0000, 1.0000, 2.0000, -5.0000]
549
sage: 3*v
550
[1.0000, 2.0000, -5.0000, 1.0000, 2.0000, -5.0000, 1.0000, 2.0000, -5.0000]
551
sage: v*v
552
Traceback (most recent call last):
553
...
554
TypeError: 'sage.finance.time_series.TimeSeries' object cannot be interpreted as an index
555
"""
556
cdef Py_ssize_t n, i
557
cdef TimeSeries T
558
if isinstance(left, TimeSeries):
559
T = left
560
n = right
561
else:
562
T = right
563
n = left
564
# Make n copies of T concatenated together
565
cdef TimeSeries v = new_time_series(T._length * n)
566
for i from 0 <= i < n:
567
memcpy(v._values + i*T._length, T._values, sizeof(double)*T._length)
568
return v
569
570
571
def autoregressive_fit(self,M):
572
r"""
573
This method fits the time series to an autoregressive process
574
of order ``M``. That is, we assume the process is given by
575
`X_t-\mu=a_1(X_{t-1}-\mu)+a_2(X_{t-1}-\mu)+\cdots+a_M(X_{t-M}-\mu)+Z_t`
576
where `\mu` is the mean of the process and `Z_t` is noise.
577
This method returns estimates for `a_1,\dots,a_M`.
578
579
The method works by solving the Yule-Walker equations
580
`\Gamma a =\gamma`, where `\gamma=(\gamma(1),\dots,\gamma(M))`,
581
`a=(a_1,\dots,a_M)` with `\gamma(i)` the autocovariance of lag `i`
582
and `\Gamma_{ij}=\gamma(i-j)`.
583
584
585
.. WARNING::
586
587
The input sequence is assumed to be stationary, which
588
means that the autocovariance `\langle y_j y_k \rangle` depends
589
only on the difference `|j-k|`.
590
591
INPUT:
592
593
- ``M`` -- an integer.
594
595
OUTPUT:
596
597
A time series -- the coefficients of the autoregressive process.
598
599
EXAMPLES::
600
601
sage: set_random_seed(0)
602
sage: v = finance.TimeSeries(10^4).randomize('normal').sums()
603
sage: F = v.autoregressive_fit(100)
604
sage: v
605
[0.6767, 0.2756, 0.6332, 0.0469, -0.8897 ... 87.6759, 87.6825, 87.4120, 87.6639, 86.3194]
606
sage: v.autoregressive_forecast(F)
607
86.0177285042...
608
sage: F
609
[1.0148, -0.0029, -0.0105, 0.0067, -0.0232 ... -0.0106, -0.0068, 0.0085, -0.0131, 0.0092]
610
611
sage: set_random_seed(0)
612
sage: t=finance.TimeSeries(2000)
613
sage: z=finance.TimeSeries(2000)
614
sage: z.randomize('normal',1)
615
[1.6767, 0.5989, 1.3576, 0.4136, 0.0635 ... 1.0057, -1.1467, 1.2809, 1.5705, 1.1095]
616
sage: t[0]=1
617
sage: t[1]=2
618
sage: for i in range(2,2000):
619
... t[i]=t[i-1]-0.5*t[i-2]+z[i]
620
...
621
sage: c=t[0:-1].autoregressive_fit(2) #recovers recurrence relation
622
sage: c #should be close to [1,-0.5]
623
[1.0371, -0.5199]
624
"""
625
acvs = [self.autocovariance(i) for i in range(M+1)]
626
return autoregressive_fit(acvs)
627
628
def autoregressive_forecast(self, filter):
629
"""
630
Given the autoregression coefficients as outputted by the
631
``autoregressive_fit`` command, compute the forecast for the next
632
term in the series.
633
634
INPUT:
635
636
- ``filter`` -- a time series outputted by the ``autoregressive_fit``
637
command.
638
639
EXAMPLES::
640
641
sage: set_random_seed(0)
642
sage: v = finance.TimeSeries(100).randomize('normal').sums()
643
sage: F = v[:-1].autoregressive_fit(5); F
644
[1.0019, -0.0524, -0.0643, 0.1323, -0.0539]
645
sage: v.autoregressive_forecast(F)
646
11.7820298611...
647
sage: v
648
[0.6767, 0.2756, 0.6332, 0.0469, -0.8897 ... 9.2447, 9.6709, 10.4037, 10.4836, 12.1960]
649
"""
650
cdef TimeSeries filt
651
if isinstance(filter, TimeSeries):
652
filt = filter
653
else:
654
filt = TimeSeries(filter)
655
656
cdef double f = 0
657
cdef Py_ssize_t i
658
for i from 0 <= i < min(self._length, filt._length):
659
f += self._values[self._length - i - 1] * filt._values[i]
660
return f
661
662
def reversed(self):
663
"""
664
Return new time series obtain from this time series by
665
reversing the order of the entries in this time series.
666
667
OUTPUT:
668
669
A time series.
670
671
EXAMPLES::
672
673
sage: v = finance.TimeSeries([1..5])
674
sage: v.reversed()
675
[5.0000, 4.0000, 3.0000, 2.0000, 1.0000]
676
"""
677
cdef Py_ssize_t i, n = self._length-1
678
cdef TimeSeries t = new_time_series(self._length)
679
680
for i from 0 <= i < self._length:
681
t._values[i] = self._values[n - i]
682
return t
683
684
def extend(self, right):
685
"""
686
Extend this time series by appending elements from the iterable
687
``right``.
688
689
INPUT:
690
691
- ``right`` -- iterable that can be converted to a time series.
692
693
EXAMPLES::
694
695
sage: v = finance.TimeSeries([1,2,-5]); v
696
[1.0000, 2.0000, -5.0000]
697
sage: v.extend([-3.5, 2])
698
sage: v
699
[1.0000, 2.0000, -5.0000, -3.5000, 2.0000]
700
"""
701
if not isinstance(right, TimeSeries):
702
right = TimeSeries(right)
703
if len(right) == 0:
704
return
705
cdef TimeSeries T = right
706
cdef double* z = <double*> sage_malloc(sizeof(double)*(self._length + T._length))
707
if z == NULL:
708
raise MemoryError
709
memcpy(z, self._values, sizeof(double)*self._length)
710
memcpy(z + self._length, T._values, sizeof(double)*T._length)
711
sage_free(self._values)
712
self._values = z
713
self._length = self._length + T._length
714
715
def list(self):
716
"""
717
Return list of elements of ``self``.
718
719
EXAMPLES::
720
721
sage: v = finance.TimeSeries([1,-4,3,-2.5,-4,3])
722
sage: v.list()
723
[1.0, -4.0, 3.0, -2.5, -4.0, 3.0]
724
"""
725
cdef Py_ssize_t i
726
return [self._values[i] for i in range(self._length)]
727
728
def log(self):
729
"""
730
Return new time series got by taking the logarithms of all the
731
terms in the time series.
732
733
OUTPUT:
734
735
A new time series.
736
737
EXAMPLES:
738
739
We exponentiate then log a time series and get back
740
the original series::
741
742
sage: v = finance.TimeSeries([1,-4,3,-2.5,-4,3]); v
743
[1.0000, -4.0000, 3.0000, -2.5000, -4.0000, 3.0000]
744
sage: v.exp()
745
[2.7183, 0.0183, 20.0855, 0.0821, 0.0183, 20.0855]
746
sage: v.exp().log()
747
[1.0000, -4.0000, 3.0000, -2.5000, -4.0000, 3.0000]
748
749
Log of 0 gives ``-inf``::
750
751
sage: finance.TimeSeries([1,0,3]).log()[1]
752
-inf
753
"""
754
cdef Py_ssize_t i
755
cdef TimeSeries t = new_time_series(self._length)
756
for i from 0 <= i < self._length:
757
t._values[i] = log(self._values[i])
758
return t
759
760
def exp(self):
761
"""
762
Return new time series got by applying the exponential map to
763
all the terms in the time series.
764
765
OUTPUT:
766
767
A new time series.
768
769
EXAMPLES::
770
771
sage: v = finance.TimeSeries([1..5]); v
772
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000]
773
sage: v.exp()
774
[2.7183, 7.3891, 20.0855, 54.5982, 148.4132]
775
sage: v.exp().log()
776
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000]
777
"""
778
cdef Py_ssize_t i
779
cdef TimeSeries t = new_time_series(self._length)
780
for i from 0 <= i < self._length:
781
t._values[i] = exp(self._values[i])
782
return t
783
784
def abs(self):
785
"""
786
Return new time series got by replacing all entries
787
of ``self`` by their absolute value.
788
789
OUTPUT:
790
791
A new time series.
792
793
EXAMPLES::
794
795
sage: v = finance.TimeSeries([1,3.1908439,-4,5.93932])
796
sage: v
797
[1.0000, 3.1908, -4.0000, 5.9393]
798
sage: v.abs()
799
[1.0000, 3.1908, 4.0000, 5.9393]
800
"""
801
cdef Py_ssize_t i
802
cdef TimeSeries t = new_time_series(self._length)
803
for i from 0 <= i < self._length:
804
t._values[i] = self._values[i] if self._values[i] >= 0 else -self._values[i]
805
return t
806
807
def diffs(self, Py_ssize_t k = 1):
808
r"""
809
Return the new time series got by taking the differences of
810
successive terms in the time series. So if ``self`` is the time
811
series `X_0, X_1, X_2, \dots`, then this function outputs the
812
series `X_1 - X_0, X_2 - X_1, \dots`. The output series has one
813
less term than the input series. If the optional parameter
814
`k` is given, return `X_k - X_0, X_{2k} - X_k, \dots`.
815
816
INPUT:
817
818
- ``k`` -- positive integer (default: 1)
819
820
OUTPUT:
821
822
A new time series.
823
824
EXAMPLES::
825
826
sage: v = finance.TimeSeries([5,4,1.3,2,8]); v
827
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000]
828
sage: v.diffs()
829
[-1.0000, -2.7000, 0.7000, 6.0000]
830
"""
831
if k != 1:
832
return self.scale_time(k).diffs()
833
cdef Py_ssize_t i
834
cdef TimeSeries t = new_time_series(self._length - 1)
835
for i from 1 <= i < self._length:
836
t._values[i-1] = self._values[i] - self._values[i-1]
837
return t
838
839
def scale_time(self, Py_ssize_t k):
840
r"""
841
Return the new time series at scale ``k``. If the input
842
time series is `X_0, X_1, X_2, \dots`, then this function
843
returns the shorter time series `X_0, X_k, X_{2k}, \dots`.
844
845
INPUT:
846
847
- ``k`` -- a positive integer.
848
849
OUTPUT:
850
851
A new time series.
852
853
EXAMPLES::
854
855
sage: v = finance.TimeSeries([5,4,1.3,2,8,10,3,-5]); v
856
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
857
sage: v.scale_time(1)
858
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
859
sage: v.scale_time(2)
860
[5.0000, 1.3000, 8.0000, 3.0000]
861
sage: v.scale_time(3)
862
[5.0000, 2.0000]
863
sage: v.scale_time(10)
864
[]
865
866
A series of odd length::
867
868
sage: v = finance.TimeSeries([1..5]); v
869
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000]
870
sage: v.scale_time(2)
871
[1.0000, 3.0000, 5.0000]
872
873
TESTS::
874
875
sage: v.scale_time(0)
876
Traceback (most recent call last):
877
...
878
ValueError: k must be positive
879
sage: v.scale_time(-1)
880
Traceback (most recent call last):
881
...
882
ValueError: k must be positive
883
"""
884
if k <= 0:
885
raise ValueError, "k must be positive"
886
887
cdef Py_ssize_t i, n
888
n = self._length/k
889
if self._length % 2:
890
n += 1
891
cdef TimeSeries t = new_time_series(n) # in C / is floor division.
892
for i from 0 <= i < n:
893
t._values[i] = self._values[i*k]
894
return t
895
896
cpdef rescale(self, double s):
897
"""
898
Change ``self`` by multiplying every value in the series by ``s``.
899
900
INPUT:
901
902
- ``s`` -- a float.
903
904
EXAMPLES::
905
906
sage: v = finance.TimeSeries([5,4,1.3,2,8,10,3,-5]); v
907
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
908
sage: v.rescale(0.5)
909
sage: v
910
[2.5000, 2.0000, 0.6500, 1.0000, 4.0000, 5.0000, 1.5000, -2.5000]
911
"""
912
for i from 0 <= i < self._length:
913
self._values[i] = self._values[i] * s
914
915
def scale(self, double s):
916
"""
917
Return new time series obtained by multiplying every value in the
918
series by ``s``.
919
920
INPUT:
921
922
- ``s`` -- a float.
923
924
OUTPUT:
925
926
A new time series with all values multiplied by ``s``.
927
928
EXAMPLES::
929
930
sage: v = finance.TimeSeries([5,4,1.3,2,8,10,3,-5]); v
931
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
932
sage: v.scale(0.5)
933
[2.5000, 2.0000, 0.6500, 1.0000, 4.0000, 5.0000, 1.5000, -2.5000]
934
"""
935
cdef TimeSeries t = new_time_series(self._length)
936
for i from 0 <= i < self._length:
937
t._values[i] = self._values[i] * s
938
return t
939
940
def add_scalar(self, double s):
941
"""
942
Return new time series obtained by adding a scalar to every
943
value in the series.
944
945
.. NOTE::
946
947
To add componentwise, use the ``add_entries`` method.
948
949
INPUT:
950
951
- ``s`` -- a float.
952
953
OUTPUT:
954
955
A new time series with ``s`` added to all values.
956
957
EXAMPLES::
958
959
sage: v = finance.TimeSeries([5,4,1.3,2,8,10,3,-5]); v
960
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
961
sage: v.add_scalar(0.5)
962
[5.5000, 4.5000, 1.8000, 2.5000, 8.5000, 10.5000, 3.5000, -4.5000]
963
"""
964
cdef TimeSeries t = new_time_series(self._length)
965
for i from 0 <= i < self._length:
966
t._values[i] = self._values[i] + s
967
return t
968
969
def add_entries(self, t):
970
"""
971
Add corresponding entries of ``self`` and ``t`` together,
972
extending either ``self`` or ``t`` by 0's if they do
973
not have the same length.
974
975
.. NOTE::
976
977
To add a single number to the entries of a time series,
978
use the ``add_scalar`` method.
979
980
INPUT:
981
982
- ``t`` -- a time series.
983
984
OUTPUT:
985
986
A time series with length the maxima of the lengths of
987
``self`` and ``t``.
988
989
EXAMPLES::
990
991
sage: v = finance.TimeSeries([1,2,-5]); v
992
[1.0000, 2.0000, -5.0000]
993
sage: v.add_entries([3,4])
994
[4.0000, 6.0000, -5.0000]
995
sage: v.add_entries(v)
996
[2.0000, 4.0000, -10.0000]
997
sage: v.add_entries([3,4,7,18.5])
998
[4.0000, 6.0000, 2.0000, 18.5000]
999
"""
1000
if not PY_TYPE_CHECK(t, TimeSeries):
1001
t = TimeSeries(t)
1002
cdef Py_ssize_t i, n
1003
cdef TimeSeries T = t, shorter, longer
1004
if self._length > T._length:
1005
shorter = T
1006
longer = self
1007
else:
1008
shorter = self
1009
longer = T
1010
1011
n = longer._length
1012
cdef TimeSeries v = new_time_series(n)
1013
1014
# Iterate up to the length of the shorter one
1015
for i from 0 <= i < shorter._length:
1016
v._values[i] = shorter._values[i] + longer._values[i]
1017
1018
# Copy over the rest of the values
1019
if shorter._length != n:
1020
memcpy(v._values + shorter._length,
1021
longer._values + shorter._length,
1022
sizeof(double)*(v._length - shorter._length))
1023
1024
return v
1025
1026
def show(self, *args, **kwds):
1027
"""
1028
Calls plot and passes all arguments onto the plot function. This is
1029
thus just an alias for plot.
1030
1031
EXAMPLES:
1032
1033
Draw a plot of a time series::
1034
1035
sage: finance.TimeSeries([1..10]).show()
1036
"""
1037
return self.plot(*args, **kwds)
1038
1039
def plot(self, Py_ssize_t plot_points=1000, points=False, **kwds):
1040
r"""
1041
Return a plot of this time series as a line or points through
1042
`(i,T(i))`, where `i` ranges over nonnegative integers up to the
1043
length of ``self``.
1044
1045
INPUT:
1046
1047
- ``plot_points`` -- (default: 1000) 0 or positive integer. Only
1048
plot the given number of equally spaced points in the time series.
1049
If 0, plot all points.
1050
1051
- ``points`` -- bool (default: ``False``). If ``True``, return just
1052
the points of the time series.
1053
1054
- ``**kwds`` -- passed to the line or point command.
1055
1056
EXAMPLES::
1057
1058
sage: v = finance.TimeSeries([5,4,1.3,2,8,10,3,-5]); v
1059
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
1060
sage: v.plot()
1061
sage: v.plot(points=True)
1062
sage: v.plot() + v.plot(points=True, rgbcolor='red')
1063
sage: v.plot() + v.plot(points=True, rgbcolor='red', pointsize=50)
1064
"""
1065
from sage.plot.all import line, point
1066
cdef Py_ssize_t s
1067
1068
if self._length < plot_points:
1069
plot_points = 0
1070
1071
if plot_points > 0:
1072
s = self._length/plot_points
1073
if plot_points <= 0:
1074
raise ValueError, "plot_points must be a positive integer"
1075
v = self.scale_time(s).list()[:plot_points]
1076
else:
1077
s = 1
1078
v = self.list()
1079
w = [(i * s, y) for i,y in enumerate(v)]
1080
if points:
1081
L = point(w, **kwds)
1082
else:
1083
L = line(w, **kwds)
1084
L.axes_range(ymin=min(v), ymax=max(v), xmin=0, xmax=len(v)*s)
1085
return L
1086
1087
def simple_moving_average(self, Py_ssize_t k):
1088
"""
1089
Return the moving average time series over the last ``k`` time units.
1090
Assumes the input time series was constant with its starting value
1091
for negative time. The t-th step of the output is the sum of
1092
the previous ``k - 1`` steps of ``self`` and the ``k``-th step
1093
divided by ``k``. Thus ``k`` values are averaged at each point.
1094
1095
INPUT:
1096
1097
- ``k`` -- positive integer.
1098
1099
OUTPUT:
1100
1101
A time series with the same number of steps as ``self``.
1102
1103
EXAMPLES::
1104
1105
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1106
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1107
sage: v.simple_moving_average(0)
1108
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1109
sage: v.simple_moving_average(1)
1110
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1111
sage: v.simple_moving_average(2)
1112
[1.0000, 1.0000, 1.0000, 1.5000, 2.5000]
1113
sage: v.simple_moving_average(3)
1114
[1.0000, 1.0000, 1.0000, 1.3333, 2.0000]
1115
"""
1116
if k == 0 or k == 1:
1117
return self.__copy__()
1118
if k <= 0:
1119
raise ValueError, "k must be positive"
1120
cdef Py_ssize_t i
1121
cdef TimeSeries t = new_time_series(self._length)
1122
if self._length == 0:
1123
return t
1124
1125
cdef double s = self._values[0] * k
1126
for i from 0 <= i < self._length:
1127
if i >= k:
1128
s -= self._values[i-k]
1129
else:
1130
s -= self._values[0]
1131
# I have a slight concern about accumulated rounding error given how
1132
# this algorithm adds and subtracts.
1133
s += self._values[i]
1134
t._values[i] = s/k
1135
return t
1136
1137
def exponential_moving_average(self, double alpha):
1138
"""
1139
Return the exponential moving average time series. Assumes
1140
the input time series was constant with its starting value for
1141
negative time. The t-th step of the output is the sum of the
1142
previous k-1 steps of ``self`` and the k-th step divided by k.
1143
1144
The 0-th term is formally undefined, so we define it to be 0,
1145
and we define the first term to be ``self[0]``.
1146
1147
INPUT:
1148
1149
- ``alpha`` -- float; a smoothing factor with ``0 <= alpha <= 1``.
1150
1151
OUTPUT:
1152
1153
A time series with the same number of steps as ``self``.
1154
1155
EXAMPLES::
1156
1157
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1158
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1159
sage: v.exponential_moving_average(0)
1160
[0.0000, 1.0000, 1.0000, 1.0000, 1.0000]
1161
sage: v.exponential_moving_average(1)
1162
[0.0000, 1.0000, 1.0000, 1.0000, 2.0000]
1163
sage: v.exponential_moving_average(0.5)
1164
[0.0000, 1.0000, 1.0000, 1.0000, 1.5000]
1165
1166
Some more examples::
1167
1168
sage: v = finance.TimeSeries([1,2,3,4,5])
1169
sage: v.exponential_moving_average(1)
1170
[0.0000, 1.0000, 2.0000, 3.0000, 4.0000]
1171
sage: v.exponential_moving_average(0)
1172
[0.0000, 1.0000, 1.0000, 1.0000, 1.0000]
1173
"""
1174
if alpha < 0 or alpha > 1:
1175
raise ValueError, "alpha must be between 0 and 1"
1176
cdef Py_ssize_t i
1177
cdef TimeSeries t = new_time_series(self._length)
1178
if self._length == 0:
1179
return t
1180
t._values[0] = 0
1181
if self._length == 1:
1182
return t
1183
t._values[1] = self._values[0]
1184
for i from 2 <= i < self._length:
1185
t._values[i] = alpha * self._values[i-1] + (1-alpha) *t._values[i-1]
1186
return t
1187
1188
def sums(self, double s=0):
1189
"""
1190
Return the new time series got by taking the running partial
1191
sums of the terms of this time series.
1192
1193
INPUT:
1194
1195
- ``s`` -- starting value for partial sums.
1196
1197
OUTPUT:
1198
1199
A time series.
1200
1201
EXAMPLES::
1202
1203
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1204
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1205
sage: v.sums()
1206
[1.0000, 2.0000, 3.0000, 5.0000, 8.0000]
1207
"""
1208
cdef Py_ssize_t i
1209
cdef TimeSeries t = new_time_series(self._length)
1210
for i from 0 <= i < self._length:
1211
s += self._values[i]
1212
t._values[i] = s
1213
return t
1214
1215
cpdef double sum(self):
1216
"""
1217
Return the sum of all the entries of ``self``. If ``self`` has
1218
length 0, returns 0.
1219
1220
OUTPUT:
1221
1222
A double.
1223
1224
EXAMPLES::
1225
1226
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1227
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1228
sage: v.sum()
1229
8.0
1230
"""
1231
cdef double s = 0
1232
cdef Py_ssize_t i
1233
for i from 0 <= i < self._length:
1234
s += self._values[i]
1235
return s
1236
1237
def prod(self):
1238
"""
1239
Return the product of all the entries of ``self``. If ``self`` has
1240
length 0, returns 1.
1241
1242
OUTPUT:
1243
1244
A double.
1245
1246
EXAMPLES::
1247
1248
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1249
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1250
sage: v.prod()
1251
6.0
1252
"""
1253
cdef double s = 1
1254
cdef Py_ssize_t i
1255
for i from 0 <= i < self._length:
1256
s *= self._values[i]
1257
return s
1258
1259
1260
def mean(self):
1261
"""
1262
Return the mean (average) of the elements of ``self``.
1263
1264
OUTPUT:
1265
1266
A double.
1267
1268
EXAMPLES::
1269
1270
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1271
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1272
sage: v.mean()
1273
1.6
1274
"""
1275
return self.sum() / self._length
1276
1277
def pow(self, double k):
1278
"""
1279
Return a new time series with every elements of ``self`` raised to the
1280
k-th power.
1281
1282
INPUT:
1283
1284
- ``k`` -- a float.
1285
1286
OUTPUT:
1287
1288
A time series.
1289
1290
EXAMPLES::
1291
1292
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1293
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1294
sage: v.pow(2)
1295
[1.0000, 1.0000, 1.0000, 4.0000, 9.0000]
1296
"""
1297
cdef Py_ssize_t i
1298
cdef TimeSeries t = new_time_series(self._length)
1299
for i from 0 <= i < self._length:
1300
t._values[i] = pow(self._values[i], k)
1301
return t
1302
1303
def moment(self, int k):
1304
"""
1305
Return the k-th moment of ``self``, which is just the
1306
mean of the k-th powers of the elements of ``self``.
1307
1308
INPUT:
1309
1310
- ``k`` -- a positive integer.
1311
1312
OUTPUT:
1313
1314
A double.
1315
1316
EXAMPLES::
1317
1318
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1319
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1320
sage: v.moment(1)
1321
1.6
1322
sage: v.moment(2)
1323
3.2
1324
"""
1325
if k <= 0:
1326
raise ValueError, "k must be positive"
1327
if k == 1:
1328
return self.mean()
1329
cdef double s = 0
1330
cdef Py_ssize_t i
1331
for i from 0 <= i < self._length:
1332
s += pow(self._values[i], k)
1333
return s / self._length
1334
1335
def central_moment(self, int k):
1336
"""
1337
Return the k-th central moment of ``self``, which is just the mean
1338
of the k-th powers of the differences ``self[i] - mu``, where ``mu`` is
1339
the mean of ``self``.
1340
1341
INPUT:
1342
1343
- ``k`` -- a positive integer.
1344
1345
OUTPUT:
1346
1347
A double.
1348
1349
EXAMPLES::
1350
1351
sage: v = finance.TimeSeries([1,2,3])
1352
sage: v.central_moment(2)
1353
0.6666666666666666
1354
1355
Note that the central moment is different from the moment
1356
here, since the mean is not `0`::
1357
1358
sage: v.moment(2) # doesn't subtract off mean
1359
4.666666666666667
1360
1361
We compute the central moment directly::
1362
1363
sage: mu = v.mean(); mu
1364
2.0
1365
sage: ((1-mu)^2 + (2-mu)^2 + (3-mu)^2) / 3
1366
0.6666666666666666
1367
"""
1368
if k == 1:
1369
return float(0)
1370
mu = self.mean()
1371
# We could make this slightly faster by doing the add scalar
1372
# and moment calculation together. But that would be nasty.
1373
return self.add_scalar(-mu).moment(k)
1374
1375
def covariance(self, TimeSeries other):
1376
r"""
1377
Return the covariance of the time series ``self`` and ``other``.
1378
1379
INPUT:
1380
1381
- ``self``, ``other`` -- time series.
1382
1383
Whichever time series has more terms is truncated.
1384
1385
EXAMPLES::
1386
1387
sage: v = finance.TimeSeries([1,-2,3]); w = finance.TimeSeries([4,5,-10])
1388
sage: v.covariance(w)
1389
-11.777777777777779
1390
"""
1391
cdef double mu = self.mean(), mu2 = other.mean()
1392
cdef double s = 0
1393
cdef Py_ssize_t i
1394
cdef Py_ssize_t n = min(self._length, other._length)
1395
for i from 0 <= i < n:
1396
s += (self._values[i] - mu)*(other._values[i] - mu2)
1397
return s / n
1398
# NOTE: There is also a formula
1399
# self._values[i]*other._values[i] - mu*mu2
1400
# but that seems less numerically stable (?), and when tested
1401
# was not noticeably faster.
1402
1403
def autocovariance(self, Py_ssize_t k=0):
1404
r"""
1405
Return the k-th autocovariance function `\gamma(k)` of ``self``.
1406
This is the covariance of ``self`` with ``self`` shifted by `k`, i.e.,
1407
1408
.. MATH::
1409
1410
\left.
1411
\left( \sum_{t=0}^{n-k-1} (x_t - \mu)(x_{t + k} - \mu) \right)
1412
\right/ n,
1413
1414
where `n` is the length of ``self``.
1415
1416
Note the denominator of `n`, which gives a "better" sample
1417
estimator.
1418
1419
INPUT:
1420
1421
- ``k`` -- a nonnegative integer (default: 0)
1422
1423
OUTPUT:
1424
1425
A float.
1426
1427
EXAMPLES::
1428
1429
sage: v = finance.TimeSeries([13,8,15,4,4,12,11,7,14,12])
1430
sage: v.autocovariance(0)
1431
14.4
1432
sage: mu = v.mean(); sum([(a-mu)^2 for a in v])/len(v)
1433
14.4
1434
sage: v.autocovariance(1)
1435
-2.7
1436
sage: mu = v.mean(); sum([(v[i]-mu)*(v[i+1]-mu) for i in range(len(v)-1)])/len(v)
1437
-2.7
1438
sage: v.autocovariance(1)
1439
-2.7
1440
1441
We illustrate with a random sample that an independently and
1442
identically distributed distribution with zero mean and
1443
variance `\sigma^2` has autocovariance function `\gamma(h)`
1444
with `\gamma(0) = \sigma^2` and `\gamma(h) = 0` for `h\neq 0`. ::
1445
1446
sage: set_random_seed(0)
1447
sage: v = finance.TimeSeries(10^6)
1448
sage: v.randomize('normal', 0, 5)
1449
[3.3835, -2.0055, 1.7882, -2.9319, -4.6827 ... -5.1868, 9.2613, 0.9274, -6.2282, -8.7652]
1450
sage: v.autocovariance(0)
1451
24.95410689...
1452
sage: v.autocovariance(1)
1453
-0.00508390...
1454
sage: v.autocovariance(2)
1455
0.022056325...
1456
sage: v.autocovariance(3)
1457
-0.01902000...
1458
"""
1459
cdef double mu = self.mean()
1460
cdef double s = 0
1461
cdef Py_ssize_t i
1462
cdef Py_ssize_t n = self._length - k
1463
for i from 0 <= i < n:
1464
s += (self._values[i] - mu)*(self._values[i+k] - mu)
1465
return s / self._length
1466
1467
def correlation(self, TimeSeries other):
1468
"""
1469
Return the correlation of ``self`` and ``other``, which is the
1470
covariance of ``self`` and ``other`` divided by the product of their
1471
standard deviation.
1472
1473
INPUT:
1474
1475
- ``self``, ``other`` -- time series.
1476
1477
Whichever time series has more terms is truncated.
1478
1479
EXAMPLES::
1480
1481
sage: v = finance.TimeSeries([1,-2,3]); w = finance.TimeSeries([4,5,-10])
1482
sage: v.correlation(w)
1483
-0.558041609...
1484
sage: v.covariance(w)/(v.standard_deviation() * w.standard_deviation())
1485
-0.558041609...
1486
"""
1487
return self.covariance(other) / (self.standard_deviation() * other.standard_deviation())
1488
1489
def autocorrelation(self, Py_ssize_t k=1):
1490
r"""
1491
Return the k-th sample autocorrelation of this time series
1492
`x_i`.
1493
1494
Let `\mu` be the sample mean. Then the sample autocorrelation
1495
function is
1496
1497
.. MATH::
1498
1499
\frac{\sum_{t=0}^{n-k-1} (x_t - \mu)(x_{t+k} - \mu) }
1500
{\sum_{t=0}^{n-1} (x_t - \mu)^2}.
1501
1502
Note that the variance must be nonzero or you will get a
1503
``ZeroDivisionError``.
1504
1505
INPUT:
1506
1507
- ``k`` -- a nonnegative integer (default: 1)
1508
1509
OUTPUT:
1510
1511
A time series.
1512
1513
EXAMPLE::
1514
1515
sage: v = finance.TimeSeries([13,8,15,4,4,12,11,7,14,12])
1516
sage: v.autocorrelation()
1517
-0.1875
1518
sage: v.autocorrelation(1)
1519
-0.1875
1520
sage: v.autocorrelation(0)
1521
1.0
1522
sage: v.autocorrelation(2)
1523
-0.20138888888888887
1524
sage: v.autocorrelation(3)
1525
0.18055555555555555
1526
1527
sage: finance.TimeSeries([1..1000]).autocorrelation()
1528
0.997
1529
"""
1530
return self.autocovariance(k) / self.variance(bias=True)
1531
1532
def variance(self, bias=False):
1533
"""
1534
Return the variance of the elements of ``self``, which is the mean
1535
of the squares of the differences from the mean.
1536
1537
INPUT:
1538
1539
- ``bias`` -- bool (default: ``False``); if ``False``, divide by
1540
``self.length() - 1`` instead of ``self.length()`` to give a less
1541
biased estimator for the variance.
1542
1543
OUTPUT:
1544
1545
A double.
1546
1547
EXAMPLE::
1548
1549
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1550
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1551
sage: v.variance()
1552
0.8
1553
sage: v.variance(bias=True)
1554
0.64
1555
1556
TESTS::
1557
1558
sage: finance.TimeSeries([1]).variance()
1559
0.0
1560
sage: finance.TimeSeries([]).variance()
1561
0.0
1562
"""
1563
if self._length <= 1:
1564
return float(0)
1565
cdef double mu = self.mean()
1566
cdef double s = 0
1567
cdef double a
1568
cdef Py_ssize_t i
1569
for i from 0 <= i < self._length:
1570
a = self._values[i] - mu
1571
s += a * a
1572
if bias:
1573
return s / self._length
1574
else:
1575
return s / (self._length - 1)
1576
1577
def standard_deviation(self, bias=False):
1578
"""
1579
Return the standard deviation of the entries of ``self``.
1580
1581
INPUT:
1582
1583
- ``bias`` -- bool (default: ``False``); if ``False``, divide by
1584
``self.length() - 1`` instead of ``self.length()`` to give a less
1585
biased estimator for the variance.
1586
1587
OUTPUT:
1588
1589
A double.
1590
1591
EXAMPLES::
1592
1593
sage: v = finance.TimeSeries([1,1,1,2,3]); v
1594
[1.0000, 1.0000, 1.0000, 2.0000, 3.0000]
1595
sage: v.standard_deviation()
1596
0.8944271909...
1597
sage: v.standard_deviation(bias=True)
1598
0.8
1599
1600
TESTS::
1601
1602
sage: finance.TimeSeries([1]).standard_deviation()
1603
0.0
1604
sage: finance.TimeSeries([]).standard_deviation()
1605
0.0
1606
"""
1607
return sqrt(self.variance(bias=bias))
1608
1609
def range_statistic(self, b=None):
1610
r"""
1611
Return the rescaled range statistic `R/S` of ``self``, which is
1612
defined as follows (see Hurst 1951). If the optional
1613
parameter ``b`` is given, return the average of `R/S` range
1614
statistics of disjoint blocks of size ``b``.
1615
1616
Let `\sigma` be the standard deviation of the sequence of
1617
differences of ``self``, and let `Y_k` be the k-th term of ``self``.
1618
Let `n` be the number of terms of ``self``, and set
1619
`Z_k = Y_k - ((k+1)/n) \cdot Y_n`. Then
1620
1621
.. MATH::
1622
1623
R/S = \big( \max(Z_k) - \min(Z_k) \big) / \sigma
1624
1625
where the max and min are over all `Z_k`.
1626
Basically replacing `Y_k` by `Z_k` allows us to measure
1627
the difference from the line from the origin to `(n,Y_n)`.
1628
1629
INPUT:
1630
1631
- ``self`` -- a time series (*not* the series of differences).
1632
1633
- ``b`` -- integer (default: ``None``); if given instead divide the
1634
input time series up into ``j = floor(n/b)`` disjoint
1635
blocks, compute the `R/S` statistic for each block,
1636
and return the average of those `R/S` statistics.
1637
1638
OUTPUT:
1639
1640
A float.
1641
1642
EXAMPLES:
1643
1644
Notice that if we make a Brownian motion random walk, there
1645
is no difference if we change the standard deviation. ::
1646
1647
sage: set_random_seed(0); finance.TimeSeries(10^6).randomize('normal').sums().range_statistic()
1648
1897.8392602...
1649
sage: set_random_seed(0); finance.TimeSeries(10^6).randomize('normal',0,100).sums().range_statistic()
1650
1897.8392602...
1651
"""
1652
cdef Py_ssize_t j, k, n = self._length
1653
1654
if b is not None:
1655
j = n // b
1656
return sum([self[k*b:(k+1)*b].add_scalar(-self[k*b]).range_statistic() for k in range(j)]) / j
1657
1658
cdef double Yn = self._values[n - 1]
1659
cdef TimeSeries Z = self.__copy__()
1660
for k from 0 <= k < n:
1661
Z._values[k] -= (k+1)*Yn / n
1662
1663
sigma = self.diffs().standard_deviation()
1664
1665
return (Z.max() - Z.min()) / sigma
1666
1667
def hurst_exponent(self):
1668
"""
1669
Returns an estimate of the Hurst exponent of this time series.
1670
We use the algorithm from pages 61 -- 63 of [Peteres, Fractal
1671
Market Analysis (1994); see Google Books].
1672
1673
We define the Hurst exponent of a constant time series to be 1.
1674
1675
EXAMPLES:
1676
1677
The Hurst exponent of Brownian motion is 1/2. We approximate
1678
it with some specific samples. Note that the estimator is
1679
biased and slightly overestimates. ::
1680
1681
sage: set_random_seed(0)
1682
sage: bm = finance.TimeSeries(10^5).randomize('normal').sums(); bm
1683
[0.6767, 0.2756, 0.6332, 0.0469, -0.8897 ... 152.2437, 151.5327, 152.7629, 152.9169, 152.9084]
1684
sage: bm.hurst_exponent()
1685
0.527450972...
1686
1687
We compute the Hurst exponent of a simulated fractional Brownian
1688
motion with Hurst parameter 0.7. This function estimates the
1689
Hurst exponent as 0.706511951... ::
1690
1691
sage: set_random_seed(0)
1692
sage: fbm = finance.fractional_brownian_motion_simulation(0.7,0.1,10^5,1)[0]
1693
sage: fbm.hurst_exponent()
1694
0.706511951...
1695
1696
Another example with small Hurst exponent (notice the overestimation).
1697
1698
::
1699
1700
sage: fbm = finance.fractional_brownian_motion_simulation(0.2,0.1,10^5,1)[0]
1701
sage: fbm.hurst_exponent()
1702
0.278997441...
1703
1704
We compute the mean Hurst exponent of 100 simulated multifractal
1705
cascade random walks::
1706
1707
sage: set_random_seed(0)
1708
sage: y = finance.multifractal_cascade_random_walk_simulation(3700,0.02,0.01,0.01,1000,100)
1709
sage: finance.TimeSeries([z.hurst_exponent() for z in y]).mean()
1710
0.57984822577934...
1711
1712
We compute the mean Hurst exponent of 100 simulated Markov switching
1713
multifractal time series. The Hurst exponent is quite small. ::
1714
1715
sage: set_random_seed(0)
1716
sage: msm = finance.MarkovSwitchingMultifractal(8,1.4,0.5,0.95,3)
1717
sage: y = msm.simulations(1000,100)
1718
sage: finance.TimeSeries([z.hurst_exponent() for z in y]).mean()
1719
0.286102325623705...
1720
"""
1721
# We use disjoint blocks of size 8, 16, 32, ....
1722
cdef Py_ssize_t k = 8
1723
if self._length <= 32: # small data, estimate of Hurst will suck but...
1724
k = 4
1725
v0 = []
1726
v1 = []
1727
while k <= self._length:
1728
try:
1729
v1.append(log(self.range_statistic(k)))
1730
v0.append(log(k))
1731
except ZeroDivisionError: # 0 standard deviation
1732
pass
1733
k *= 2
1734
if len(v0) == 0:
1735
return float(1)
1736
if len(v0) == 1:
1737
return v1[0]/v0[0]
1738
import numpy
1739
coeffs = numpy.polyfit(v0,v1,1)
1740
return coeffs[0]
1741
1742
def min(self, bint index=False):
1743
"""
1744
Return the smallest value in this time series. If this series
1745
has length 0 we raise a ``ValueError``.
1746
1747
INPUT:
1748
1749
- ``index`` -- bool (default: ``False``); if ``True``, also return
1750
index of minimal entry.
1751
1752
OUTPUT:
1753
1754
- float -- smallest value.
1755
1756
- integer -- index of smallest value; only returned if ``index=True``.
1757
1758
EXAMPLES::
1759
1760
sage: v = finance.TimeSeries([1,-4,3,-2.5,-4])
1761
sage: v.min()
1762
-4.0
1763
sage: v.min(index=True)
1764
(-4.0, 1)
1765
"""
1766
if self._length == 0:
1767
raise ValueError, "min() arg is an empty sequence"
1768
cdef Py_ssize_t i, j
1769
cdef double s = self._values[0]
1770
j = 0
1771
for i from 1 <= i < self._length:
1772
if self._values[i] < s:
1773
s = self._values[i]
1774
j = i
1775
if index:
1776
return s, j
1777
else:
1778
return s
1779
1780
def max(self, bint index=False):
1781
"""
1782
Return the largest value in this time series. If this series
1783
has length 0 we raise a ``ValueError``.
1784
1785
INPUT:
1786
1787
- ``index`` -- bool (default: ``False``); if ``True``, also return
1788
index of maximum entry.
1789
1790
OUTPUT:
1791
1792
- float -- largest value.
1793
1794
- integer -- index of largest value; only returned if ``index=True``.
1795
1796
EXAMPLES::
1797
1798
sage: v = finance.TimeSeries([1,-4,3,-2.5,-4,3])
1799
sage: v.max()
1800
3.0
1801
sage: v.max(index=True)
1802
(3.0, 2)
1803
"""
1804
if self._length == 0:
1805
raise ValueError, "max() arg is an empty sequence"
1806
cdef Py_ssize_t i, j = 0
1807
cdef double s = self._values[0]
1808
for i from 1 <= i < self._length:
1809
if self._values[i] > s:
1810
s = self._values[i]
1811
j = i
1812
if index:
1813
return s, j
1814
else:
1815
return s
1816
1817
def clip_remove(self, min=None, max=None):
1818
"""
1819
Return new time series obtained from ``self`` by removing all
1820
values that are less than or equal to a certain minimum value
1821
or greater than or equal to a certain maximum.
1822
1823
INPUT:
1824
1825
- ``min`` -- (default: ``None``) ``None`` or double.
1826
1827
- ``max`` -- (default: ``None``) ``None`` or double.
1828
1829
OUTPUT:
1830
1831
A time series.
1832
1833
EXAMPLES::
1834
1835
sage: v = finance.TimeSeries([1..10])
1836
sage: v.clip_remove(3,7)
1837
[3.0000, 4.0000, 5.0000, 6.0000, 7.0000]
1838
sage: v.clip_remove(3)
1839
[3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000]
1840
sage: v.clip_remove(max=7)
1841
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000]
1842
"""
1843
cdef Py_ssize_t i, j
1844
cdef TimeSeries t
1845
cdef double mn, mx
1846
cdef double x
1847
if min is None and max is None:
1848
return self.copy()
1849
# This code is ugly but I think as fast as possible.
1850
if min is None and max is not None:
1851
# Return everything <= max
1852
j = 0
1853
mx = max
1854
for i from 0 <= i < self._length:
1855
if self._values[i] <= mx:
1856
j += 1
1857
1858
t = TimeSeries(j)
1859
j = 0
1860
for i from 0 <= i < self._length:
1861
if self._values[i] <= mx:
1862
t._values[j] = self._values[i]
1863
j += 1
1864
elif max is None and min is not None:
1865
# Return everything >= min
1866
j = 0
1867
mn = min
1868
for i from 0 <= i < self._length:
1869
if self._values[i] >= mn:
1870
j += 1
1871
t = TimeSeries(j)
1872
j = 0
1873
for i from 0 <= i < self._length:
1874
if self._values[i] >= mn:
1875
t._values[j] = self._values[i]
1876
j += 1
1877
else:
1878
# Return everything between min and max
1879
j = 0
1880
mn = min; mx = max
1881
for i from 0 <= i < self._length:
1882
x = self._values[i]
1883
if x >= mn and x <= mx:
1884
j += 1
1885
t = TimeSeries(j)
1886
j = 0
1887
for i from 0 <= i < self._length:
1888
x = self._values[i]
1889
if x >= mn and x <= mx:
1890
t._values[j] = x
1891
j += 1
1892
return t
1893
1894
def histogram(self, Py_ssize_t bins=50, bint normalize=False):
1895
"""
1896
Return the frequency histogram of the values in
1897
this time series divided up into the given
1898
number of bins.
1899
1900
INPUT:
1901
1902
- ``bins`` -- a positive integer (default: 50)
1903
1904
- ``normalize`` -- (default: ``False``) whether to normalize so the
1905
total area in the bars of the histogram is 1.
1906
1907
OUTPUT:
1908
1909
- counts -- list of counts of numbers of elements in
1910
each bin.
1911
1912
- endpoints -- list of 2-tuples (a,b) that give the
1913
endpoints of the bins.
1914
1915
EXAMPLES::
1916
1917
sage: v = finance.TimeSeries([5,4,1.3,2,8,10,3,-5]); v
1918
[5.0000, 4.0000, 1.3000, 2.0000, 8.0000, 10.0000, 3.0000, -5.0000]
1919
sage: v.histogram(3)
1920
([1, 4, 3], [(-5.0, 0.0), (0.0, 5.0), (5.0, 10.0)])
1921
"""
1922
if bins <= 0:
1923
raise ValueError, "bins must be positive"
1924
1925
cdef double mn = self.min(), mx = self.max()
1926
cdef double r = mx - mn, step = r/bins
1927
cdef Py_ssize_t j
1928
1929
if r == 0:
1930
raise ValueError, "bins have 0 width"
1931
1932
v = [(mn + j*step, mn + (j+1)*step) for j in range(bins)]
1933
if self._length == 0:
1934
return [], v
1935
1936
if step == 0:
1937
counts = [0]*bins
1938
counts[0] = [self._length]
1939
return counts, v
1940
1941
cdef Py_ssize_t i
1942
cdef Py_ssize_t* cnts = <Py_ssize_t*>sage_malloc(sizeof(Py_ssize_t)*bins)
1943
if cnts == NULL:
1944
raise MemoryError
1945
for i from 0 <= i < bins:
1946
cnts[i] = 0
1947
1948
for i from 0 <= i < self._length:
1949
j = int((self._values[i] - mn)/step)
1950
if j >= bins:
1951
j = bins-1
1952
cnts[j] += 1
1953
1954
b = 1.0/(self._length * step)
1955
if normalize:
1956
counts = [cnts[i]*b for i in range(bins)]
1957
else:
1958
counts = [cnts[i] for i in range(bins)]
1959
sage_free(cnts)
1960
1961
return counts, v
1962
1963
def plot_histogram(self, bins=50, normalize=True, **kwds):
1964
"""
1965
Return histogram plot of this time series with given number of bins.
1966
1967
INPUT:
1968
1969
- ``bins`` -- positive integer (default: 50)
1970
1971
- ``normalize`` -- (default: ``True``) whether to normalize so the
1972
total area in the bars of the histogram is 1.
1973
1974
OUTPUT:
1975
1976
A histogram plot.
1977
1978
EXAMPLES::
1979
1980
sage: v = finance.TimeSeries([1..50])
1981
sage: v.plot_histogram(bins=10)
1982
1983
::
1984
1985
sage: v.plot_histogram(bins=3,normalize=False,aspect_ratio=1)
1986
"""
1987
from sage.plot.all import polygon
1988
counts, intervals = self.histogram(bins, normalize=normalize)
1989
s = 0
1990
kwds.setdefault('aspect_ratio','automatic')
1991
for i, (x0,x1) in enumerate(intervals):
1992
s += polygon([(x0,0), (x0,counts[i]), (x1,counts[i]), (x1,0)], **kwds)
1993
if len(intervals) > 0:
1994
s.axes_range(ymin=0, ymax=max(counts), xmin=intervals[0][0], xmax=intervals[-1][1])
1995
return s
1996
1997
def plot_candlestick(self, int bins=30):
1998
"""
1999
Return a candlestick plot of this time series with the given number
2000
of bins.
2001
2002
A candlestick plot is a style of bar-chart used to view open, high,
2003
low, and close stock data. At each bin, the line represents the
2004
high / low range. The bar represents the open / close range. The
2005
interval is colored blue if the open for that bin is less than the
2006
close. If the close is less than the open, then that bin is colored
2007
red instead.
2008
2009
INPUT:
2010
2011
- ``bins`` -- positive integer (default: 30), the number of bins
2012
or candles.
2013
2014
OUTPUT:
2015
2016
A candlestick plot.
2017
2018
EXAMPLES:
2019
2020
Here we look at the candlestick plot for Brownian motion::
2021
2022
sage: v = finance.TimeSeries(1000).randomize()
2023
sage: v.plot_candlestick(bins=20)
2024
"""
2025
from sage.plot.all import line, polygon, Graphics
2026
2027
cdef TimeSeries t = new_time_series(self._length)
2028
cdef TimeSeries s
2029
cdef int i, j, n
2030
bin_size = int(t._length/bins)
2031
rng = t._length - bin_size
2032
2033
memcpy(t._values, self._values, sizeof(double)*self._length)
2034
p = Graphics()
2035
2036
for i from 0 <= i < bins:
2037
n = i*bin_size
2038
s = new_time_series(bin_size)
2039
for j from 0 <= j < bin_size:
2040
s._values[j] = t._values[n + j]
2041
low = s.min()
2042
high = s.max()
2043
open = s._values[0]
2044
close = s._values[bin_size-1]
2045
left = n + bin_size/3
2046
mid = n + bin_size/2
2047
right = n + 2*bin_size/3
2048
2049
rgbcolor = 'blue' if open < close else 'red'
2050
2051
p += line([(mid, low), (mid, high)], rgbcolor=rgbcolor)
2052
p += polygon([(left, open), (right, open), (right, close), (left, close)], rgbcolor=rgbcolor)
2053
2054
return p
2055
2056
def numpy(self, copy=True):
2057
"""
2058
Return a NumPy version of this time series.
2059
2060
.. NOTE::
2061
2062
If copy is ``False``, return a NumPy 1-D array reference to
2063
exactly the same block of memory as this time series. This is
2064
very, very fast and makes it easy to quickly use all
2065
NumPy/SciPy functionality on ``self``. However, it is dangerous
2066
because when this time series goes out of scope and is garbage
2067
collected, the corresponding NumPy reference object will point
2068
to garbage.
2069
2070
INPUT:
2071
2072
- ``copy`` -- bool (default: ``True``)
2073
2074
OUTPUT:
2075
2076
A numpy 1-D array.
2077
2078
EXAMPLES::
2079
2080
sage: v = finance.TimeSeries([1,-3,4.5,-2])
2081
sage: w = v.numpy(copy=False); w
2082
array([ 1. , -3. , 4.5, -2. ])
2083
sage: type(w)
2084
<type 'numpy.ndarray'>
2085
sage: w.shape
2086
(4,)
2087
2088
Notice that changing ``w`` also changes ``v`` too! ::
2089
2090
sage: w[0] = 20
2091
sage: w
2092
array([ 20. , -3. , 4.5, -2. ])
2093
sage: v
2094
[20.0000, -3.0000, 4.5000, -2.0000]
2095
2096
If you want a separate copy do not give the ``copy=False`` option. ::
2097
2098
sage: z = v.numpy(); z
2099
array([ 20. , -3. , 4.5, -2. ])
2100
sage: z[0] = -10
2101
sage: v
2102
[20.0000, -3.0000, 4.5000, -2.0000]
2103
"""
2104
cnumpy.import_array() #This must be called before using the numpy C/api or you will get segfault
2105
cdef cnumpy.npy_intp dims[1]
2106
dims[0] = self._length
2107
cdef cnumpy.ndarray n = cnumpy.PyArray_SimpleNewFromData(1, dims, cnumpy.NPY_DOUBLE, self._values)
2108
if copy:
2109
return n.copy()
2110
else:
2111
# Py_INCREF(self)
2112
# n.base = self
2113
return n
2114
2115
def randomize(self, distribution='uniform', loc=0, scale=1, **kwds):
2116
r"""
2117
Randomize the entries in this time series, and return a reference
2118
to ``self``. Thus this function both changes ``self`` in place, and
2119
returns a copy of ``self``, for convenience.
2120
2121
INPUT:
2122
2123
- ``distribution`` -- (default: ``"uniform"``); supported values are:
2124
2125
- ``'uniform'`` -- from ``loc`` to ``loc + scale``
2126
2127
- ``'normal'`` -- mean ``loc`` and standard deviation ``scale``
2128
2129
- ``'semicircle'`` -- with center at ``loc`` (``scale`` is ignored)
2130
2131
- ``'lognormal'`` -- mean ``loc`` and standard deviation ``scale``
2132
2133
- ``loc`` -- float (default: 0)
2134
2135
- ``scale`` -- float (default: 1)
2136
2137
.. NOTE::
2138
2139
All random numbers are generated using algorithms that
2140
build on the high quality GMP random number function
2141
gmp_urandomb_ui. Thus this function fully respects the Sage
2142
``set_random_state`` command. It's not quite as fast as the C
2143
library random number generator, but is of course much better
2144
quality, and is platform independent.
2145
2146
EXAMPLES:
2147
2148
We generate 5 uniform random numbers in the interval [0,1]::
2149
2150
sage: set_random_seed(0)
2151
sage: finance.TimeSeries(5).randomize()
2152
[0.8685, 0.2816, 0.0229, 0.1456, 0.7314]
2153
2154
We generate 5 uniform random numbers from 5 to `5+2=7`::
2155
2156
sage: set_random_seed(0)
2157
sage: finance.TimeSeries(10).randomize('uniform', 5, 2)
2158
[6.7369, 5.5632, 5.0459, 5.2911, 6.4628, 5.2412, 5.2010, 5.2761, 5.5813, 5.5439]
2159
2160
We generate 5 normal random values with mean 0 and variance 1. ::
2161
2162
sage: set_random_seed(0)
2163
sage: finance.TimeSeries(5).randomize('normal')
2164
[0.6767, -0.4011, 0.3576, -0.5864, -0.9365]
2165
2166
We generate 10 normal random values with mean 5 and variance 2. ::
2167
2168
sage: set_random_seed(0)
2169
sage: finance.TimeSeries(10).randomize('normal', 5, 2)
2170
[6.3534, 4.1978, 5.7153, 3.8273, 3.1269, 2.9598, 3.7484, 6.7472, 3.8986, 4.6271]
2171
2172
We generate 5 values using the semicircle distribution. ::
2173
2174
sage: set_random_seed(0)
2175
sage: finance.TimeSeries(5).randomize('semicircle')
2176
[0.7369, -0.9541, 0.4628, -0.7990, -0.4187]
2177
2178
We generate 1 million normal random values and create a frequency
2179
histogram. ::
2180
2181
sage: set_random_seed(0)
2182
sage: a = finance.TimeSeries(10^6).randomize('normal')
2183
sage: a.histogram(10)[0]
2184
[36, 1148, 19604, 130699, 340054, 347870, 137953, 21290, 1311, 35]
2185
2186
We take the above values, and compute the proportion that lie within
2187
1, 2, 3, 5, and 6 standard deviations of the mean (0)::
2188
2189
sage: s = a.standard_deviation()
2190
sage: len(a.clip_remove(-s,s))/float(len(a))
2191
0.683094
2192
sage: len(a.clip_remove(-2*s,2*s))/float(len(a))
2193
0.954559
2194
sage: len(a.clip_remove(-3*s,3*s))/float(len(a))
2195
0.997228
2196
sage: len(a.clip_remove(-5*s,5*s))/float(len(a))
2197
0.999998
2198
2199
There were no "six sigma events"::
2200
2201
sage: len(a.clip_remove(-6*s,6*s))/float(len(a))
2202
1.0
2203
"""
2204
if distribution == 'uniform':
2205
self._randomize_uniform(loc, loc + scale)
2206
elif distribution == 'normal':
2207
self._randomize_normal(loc, scale)
2208
elif distribution == 'semicircle':
2209
self._randomize_semicircle(loc)
2210
elif distribution == 'lognormal':
2211
self._randomize_lognormal(loc, scale)
2212
else:
2213
raise NotImplementedError
2214
return self
2215
2216
def _randomize_uniform(self, double left, double right):
2217
"""
2218
Generates a uniform random distribution of doubles between ``left`` and
2219
``right`` and stores values in place.
2220
2221
INPUT:
2222
2223
- ``left`` -- left bound on random distribution.
2224
2225
- ``right`` -- right bound on random distribution.
2226
2227
EXAMPLES:
2228
2229
We generate 5 values distributed with respect to the uniform
2230
distribution over the interval [0,1]::
2231
2232
sage: v = finance.TimeSeries(5)
2233
sage: set_random_seed(0)
2234
sage: v.randomize('uniform')
2235
[0.8685, 0.2816, 0.0229, 0.1456, 0.7314]
2236
2237
We now test that the mean is indeed 0.5::
2238
2239
sage: v = finance.TimeSeries(10^6)
2240
sage: set_random_seed(0)
2241
sage: v.randomize('uniform').mean()
2242
0.50069085...
2243
"""
2244
if left >= right:
2245
raise ValueError, "left must be less than right"
2246
2247
cdef randstate rstate = current_randstate()
2248
cdef Py_ssize_t k
2249
cdef double d = right - left
2250
for k from 0 <= k < self._length:
2251
self._values[k] = rstate.c_rand_double() * d + left
2252
2253
def _randomize_normal(self, double m, double s):
2254
"""
2255
Generates a normal random distribution of doubles with mean ``m`` and
2256
standard deviation ``s`` and stores values in place. Uses the
2257
Box-Muller algorithm.
2258
2259
INPUT:
2260
2261
- ``m`` -- mean
2262
2263
- ``s`` -- standard deviation
2264
2265
EXAMPLES:
2266
2267
We generate 5 values distributed with respect to the normal
2268
distribution with mean 0 and standard deviation 1::
2269
2270
sage: set_random_seed(0)
2271
sage: v = finance.TimeSeries(5)
2272
sage: v.randomize('normal')
2273
[0.6767, -0.4011, 0.3576, -0.5864, -0.9365]
2274
2275
We now test that the mean is indeed 0::
2276
2277
sage: set_random_seed(0)
2278
sage: v = finance.TimeSeries(10^6)
2279
sage: v.randomize('normal').mean()
2280
6.2705472723...
2281
2282
The same test with mean equal to 2 and standard deviation equal
2283
to 5::
2284
2285
sage: set_random_seed(0)
2286
sage: v = finance.TimeSeries(10^6)
2287
sage: v.randomize('normal', 2, 5).mean()
2288
2.0003135273...
2289
"""
2290
# Ported from http://users.tkk.fi/~nbeijar/soft/terrain/source_o2/boxmuller.c
2291
# This the box muller algorithm.
2292
cdef randstate rstate = current_randstate()
2293
2294
cdef double x1, x2, w, y1, y2
2295
cdef Py_ssize_t k
2296
for k from 0 <= k < self._length:
2297
while 1:
2298
x1 = 2*rstate.c_rand_double() - 1
2299
x2 = 2*rstate.c_rand_double() - 1
2300
w = x1*x1 + x2*x2
2301
if w < 1: break
2302
w = sqrt( (-2*log(w))/w )
2303
y1 = x1 * w
2304
y2 = x2 * w
2305
self._values[k] = m + y1*s
2306
k += 1
2307
if k < self._length:
2308
self._values[k] = m + y2*s
2309
2310
def _randomize_semicircle(self, double center):
2311
"""
2312
Generates a semicircle random distribution of doubles about center
2313
and stores values in place. Uses the acceptance-rejection method.
2314
2315
INPUT:
2316
2317
- ``center`` -- the center of the semicircle distribution.
2318
2319
EXAMPLES:
2320
2321
We generate 5 values distributed with respect to the semicircle
2322
distribution located at center::
2323
2324
sage: v = finance.TimeSeries(5)
2325
sage: set_random_seed(0)
2326
sage: v.randomize('semicircle')
2327
[0.7369, -0.9541, 0.4628, -0.7990, -0.4187]
2328
2329
We now test that the mean is indeed the center::
2330
2331
sage: v = finance.TimeSeries(10^6)
2332
sage: set_random_seed(0)
2333
sage: v.randomize('semicircle').mean()
2334
0.0007207497...
2335
2336
The same test with center equal to 2::
2337
2338
sage: v = finance.TimeSeries(10^6)
2339
sage: set_random_seed(0)
2340
sage: v.randomize('semicircle', 2).mean()
2341
2.0007207497...
2342
"""
2343
cdef Py_ssize_t k
2344
cdef double x, y, s, d = 2, left = center - 1, z
2345
cdef randstate rstate = current_randstate()
2346
z = d*d
2347
s = 1.5707963267948966192 # pi/2
2348
for k from 0 <= k < self._length:
2349
while 1:
2350
x = rstate.c_rand_double() * d - 1
2351
y = rstate.c_rand_double() * s
2352
if y*y + x*x < 1:
2353
break
2354
self._values[k] = x + center
2355
2356
def _randomize_lognormal(self, double m, double s):
2357
r"""
2358
Generates a log-normal random distribution of doubles with mean ``m``
2359
and standard deviation ``s``. Uses Box-Muller algorithm and the
2360
identity: if `Y` is a random variable with normal distribution then
2361
`X = \exp(Y)` is a random variable with log-normal distribution.
2362
2363
INPUT:
2364
2365
- ``m`` -- mean
2366
2367
- ``s`` -- standard deviation
2368
2369
EXAMPLES:
2370
2371
We generate 5 values distributed with respect to the lognormal
2372
distribution with mean 0 and standard deviation 1::
2373
2374
sage: set_random_seed(0)
2375
sage: v = finance.TimeSeries(5)
2376
sage: v.randomize('lognormal')
2377
[1.9674, 0.6696, 1.4299, 0.5563, 0.3920]
2378
2379
We now test that the mean is indeed `\sqrt{e}`::
2380
2381
sage: set_random_seed(0)
2382
sage: v = finance.TimeSeries(10^6)
2383
sage: v.randomize('lognormal').mean()
2384
1.647351973...
2385
sage: exp(0.5)
2386
1.648721270...
2387
2388
A log-normal distribution can be simply thought of as the logarithm
2389
of a normally distributed data set. We test that here by generating
2390
5 values distributed with respect to the normal distribution with mean
2391
0 and standard deviation 1::
2392
2393
sage: set_random_seed(0)
2394
sage: w = finance.TimeSeries(5)
2395
sage: w.randomize('normal')
2396
[0.6767, -0.4011, 0.3576, -0.5864, -0.9365]
2397
sage: exp(w)
2398
[1.9674, 0.6696, 1.4299, 0.5563, 0.3920]
2399
"""
2400
# Ported from http://users.tkk.fi/~nbeijar/soft/terrain/source_o2/boxmuller.c
2401
# This the box muller algorithm.
2402
cdef randstate rstate = current_randstate()
2403
cdef double x1, x2, w, y1, y2
2404
cdef Py_ssize_t k
2405
for k from 0 <= k < self._length:
2406
while 1:
2407
x1 = 2*rstate.c_rand_double() - 1
2408
x2 = 2*rstate.c_rand_double() - 1
2409
w = x1*x1 + x2*x2
2410
if w < 1: break
2411
w = sqrt( (-2*log(w))/w )
2412
y1 = x1 * w
2413
y2 = x2 * w
2414
self._values[k] = exp(m + y1*s)
2415
k += 1
2416
if k < self._length:
2417
self._values[k] = exp(m + y2*s)
2418
2419
def fft(self, bint overwrite=False):
2420
r"""
2421
Return the real discrete fast Fourier transform of ``self``, as a
2422
real time series:
2423
2424
.. MATH::
2425
2426
[y(0),\Re(y(1)),\Im(y(1)),\dots,\Re(y(n/2))] \text{ if $n$ is even}
2427
2428
[y(0),\Re(y(1)),\Im(y(1)),\dots,\Re(y(n/2)),\Im(y(n/2))] \text{ if $n$ is odd}
2429
2430
where
2431
2432
.. MATH::
2433
2434
y(j) = \sum_{k=0}^{n-1} x[k] \cdot \exp(-\sqrt{-1} \cdot jk \cdot 2\pi/n)
2435
2436
for `j = 0,\dots,n-1`. Note that `y(-j) = y(n-j)`.
2437
2438
EXAMPLES::
2439
2440
sage: v = finance.TimeSeries([1..9]); v
2441
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000]
2442
sage: w = v.fft(); w
2443
[45.0000, -4.5000, 12.3636, -4.5000, 5.3629, -4.5000, 2.5981, -4.5000, 0.7935]
2444
2445
We get just the series of real parts of ::
2446
2447
sage: finance.TimeSeries([w[0]]) + w[1:].scale_time(2)
2448
[45.0000, -4.5000, -4.5000, -4.5000, -4.5000]
2449
2450
An example with an even number of terms::
2451
2452
sage: v = finance.TimeSeries([1..10]); v
2453
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000]
2454
sage: w = v.fft(); w
2455
[55.0000, -5.0000, 15.3884, -5.0000, 6.8819, -5.0000, 3.6327, -5.0000, 1.6246, -5.0000]
2456
sage: v.fft().ifft()
2457
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000]
2458
"""
2459
import scipy.fftpack
2460
if overwrite:
2461
y = self
2462
else:
2463
y = self.__copy__()
2464
w = y.numpy(copy=False)
2465
scipy.fftpack.rfft(w, overwrite_x=1)
2466
return y
2467
2468
def ifft(self, bint overwrite=False):
2469
r"""
2470
Return the real discrete inverse fast Fourier transform of
2471
``self``, which is also a real time series.
2472
2473
This is the inverse of ``fft()``.
2474
2475
The returned real array contains
2476
2477
.. MATH::
2478
2479
[y(0),y(1),\dots,y(n-1)]
2480
2481
where for `n` is even
2482
2483
.. MATH::
2484
2485
y(j)
2486
=
2487
1/n \left(
2488
\sum_{k=1}^{n/2-1}
2489
(x[2k-1]+\sqrt{-1} \cdot x[2k])
2490
\cdot \exp(\sqrt{-1} \cdot jk \cdot 2pi/n)
2491
+ c.c. + x[0] + (-1)^j x[n-1]
2492
\right)
2493
2494
and for `n` is odd
2495
2496
.. MATH::
2497
2498
y(j)
2499
=
2500
1/n \left(
2501
\sum_{k=1}^{(n-1)/2}
2502
(x[2k-1]+\sqrt{-1} \cdot x[2k])
2503
\cdot \exp(\sqrt{-1} \cdot jk \cdot 2pi/n)
2504
+ c.c. + x[0]
2505
\right)
2506
2507
where `c.c.` denotes complex conjugate of preceding expression.
2508
2509
EXAMPLES::
2510
2511
sage: v = finance.TimeSeries([1..10]); v
2512
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000]
2513
sage: v.ifft()
2514
[5.1000, -5.6876, 1.4764, -1.0774, 0.4249, -0.1000, -0.2249, 0.6663, -1.2764, 1.6988]
2515
sage: v.ifft().fft()
2516
[1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000]
2517
"""
2518
import scipy.fftpack
2519
if overwrite:
2520
y = self
2521
else:
2522
y = self.__copy__()
2523
w = y.numpy(copy=False)
2524
scipy.fftpack.irfft(w, overwrite_x=1)
2525
return y
2526
2527
2528
cdef new_time_series(Py_ssize_t length):
2529
"""
2530
Return a new uninitialized time series of the given length.
2531
The entries of the time series are garbage.
2532
2533
INPUT:
2534
2535
- ``length`` -- integer
2536
2537
OUTPUT:
2538
2539
A time series.
2540
2541
EXAMPLES:
2542
2543
This uses ``new_time_series`` implicitly::
2544
2545
sage: v = finance.TimeSeries([1,-3,4.5,-2])
2546
sage: v.__copy__()
2547
[1.0000, -3.0000, 4.5000, -2.0000]
2548
"""
2549
if length < 0:
2550
raise ValueError, "length must be nonnegative"
2551
cdef TimeSeries t = PY_NEW(TimeSeries)
2552
t._length = length
2553
t._values = <double*> sage_malloc(sizeof(double)*length)
2554
return t
2555
2556
def unpickle_time_series_v1(v, Py_ssize_t n):
2557
"""
2558
Version 1 unpickle method.
2559
2560
INPUT:
2561
2562
- ``v`` -- a raw char buffer
2563
2564
EXAMPLES::
2565
2566
sage: v = finance.TimeSeries([1,2,3])
2567
sage: s = v.__reduce__()[1][0]
2568
sage: type(s)
2569
<type 'str'>
2570
sage: sage.finance.time_series.unpickle_time_series_v1(s,3)
2571
[1.0000, 2.0000, 3.0000]
2572
sage: sage.finance.time_series.unpickle_time_series_v1(s+s,6)
2573
[1.0000, 2.0000, 3.0000, 1.0000, 2.0000, 3.0000]
2574
sage: sage.finance.time_series.unpickle_time_series_v1('',0)
2575
[]
2576
"""
2577
cdef TimeSeries t = new_time_series(n)
2578
memcpy(t._values, PyString_AsString(v), n*sizeof(double))
2579
return t
2580
2581
2582
2583
2584
def autoregressive_fit(acvs):
2585
r"""
2586
Given a sequence of lagged autocovariances of length `M` produce
2587
`a_1,\dots,a_p` so that the first `M` autocovariance coefficients
2588
of the autoregressive processes `X_t=a_1X_{t_1}+\cdots+a_pX_{t-p}+Z_t`
2589
are the same as the input sequence.
2590
2591
The function works by solving the Yule-Walker equations
2592
`\Gamma a =\gamma`, where `\gamma=(\gamma(1),\dots,\gamma(M))`,
2593
`a=(a_1,\dots,a_M)`, with `\gamma(i)` the autocovariance of lag `i`
2594
and `\Gamma_{ij}=\gamma(i-j)`.
2595
2596
EXAMPLES:
2597
2598
In this example we consider the multifractal cascade random walk
2599
of length 1000, and use simulations to estimate the
2600
expected first few autocovariance parameters for this model, then
2601
use them to construct a linear filter that works vastly better
2602
than a linear filter constructed from the same data but not using
2603
this model. The Monte-Carlo method illustrated below should work for
2604
predicting one "time step" into the future for any
2605
model that can be simulated. To predict k time steps into the
2606
future would require using a similar technique but would require
2607
scaling time by k.
2608
2609
We create 100 simulations of a multifractal random walk. This
2610
models the logarithms of a stock price sequence. ::
2611
2612
sage: set_random_seed(0)
2613
sage: y = finance.multifractal_cascade_random_walk_simulation(3700,0.02,0.01,0.01,1000,100)
2614
2615
For each walk below we replace the walk by the walk but where each
2616
step size is replaced by its absolute value -- this is what we
2617
expect to be able to predict given the model, which is only a
2618
model for predicting volatility. We compute the first 200
2619
autocovariance values for every random walk::
2620
2621
sage: c = [[a.diffs().abs().sums().autocovariance(i) for a in y] for i in range(200)]
2622
2623
We make a time series out of the expected values of the
2624
autocovariances::
2625
2626
sage: ac = finance.TimeSeries([finance.TimeSeries(z).mean() for z in c])
2627
sage: ac
2628
[3.9962, 3.9842, 3.9722, 3.9601, 3.9481 ... 1.7144, 1.7033, 1.6922, 1.6812, 1.6701]
2629
2630
.. NOTE::
2631
2632
``ac`` looks like a line -- one could best fit it to yield a lot
2633
more approximate autocovariances.
2634
2635
We compute the autoregression coefficients matching the above
2636
autocovariances::
2637
2638
sage: F = finance.autoregressive_fit(ac); F
2639
[0.9982, -0.0002, -0.0002, 0.0003, 0.0001 ... 0.0002, -0.0002, -0.0000, -0.0002, -0.0014]
2640
2641
Note that the sum is close to 1::
2642
2643
sage: sum(F)
2644
0.99593284089454...
2645
2646
Now we make up an 'out of sample' sequence::
2647
2648
sage: y2 = finance.multifractal_cascade_random_walk_simulation(3700,0.02,0.01,0.01,1000,1)[0].diffs().abs().sums()
2649
sage: y2
2650
[0.0013, 0.0059, 0.0066, 0.0068, 0.0184 ... 6.8004, 6.8009, 6.8063, 6.8090, 6.8339]
2651
2652
And we forecast the very last value using our linear filter; the forecast
2653
is close::
2654
2655
sage: y2[:-1].autoregressive_forecast(F)
2656
6.7836741372407...
2657
2658
In fact it is closer than we would get by forecasting using a
2659
linear filter made from all the autocovariances of our sequence::
2660
2661
sage: y2[:-1].autoregressive_forecast(y2[:-1].autoregressive_fit(len(y2)))
2662
6.770168705668...
2663
2664
We record the last 20 forecasts, always using all correct values up to the
2665
one we are forecasting::
2666
2667
sage: s1 = sum([(y2[:-i].autoregressive_forecast(F)-y2[-i])^2 for i in range(1,20)])
2668
2669
We do the same, but using the autocovariances of the sample sequence::
2670
2671
sage: F2 = y2[:-100].autoregressive_fit(len(F))
2672
sage: s2 = sum([(y2[:-i].autoregressive_forecast(F2)-y2[-i])^2 for i in range(1,20)])
2673
2674
Our model gives us something that is 15 percent better in this case::
2675
2676
sage: s2/s1
2677
1.15464636102...
2678
2679
How does it compare overall? To find out we do 100 simulations
2680
and for each we compute the percent that our model beats naively
2681
using the autocovariances of the sample::
2682
2683
sage: y_out = finance.multifractal_cascade_random_walk_simulation(3700,0.02,0.01,0.01,1000,100)
2684
sage: s1 = []; s2 = []
2685
sage: for v in y_out:
2686
... s1.append(sum([(v[:-i].autoregressive_forecast(F)-v[-i])^2 for i in range(1,20)]))
2687
... F2 = v[:-len(F)].autoregressive_fit(len(F))
2688
... s2.append(sum([(v[:-i].autoregressive_forecast(F2)-v[-i])^2 for i in range(1,20)]))
2689
...
2690
2691
We find that overall the model beats naive linear forecasting by 35
2692
percent! ::
2693
2694
sage: s = finance.TimeSeries([s2[i]/s1[i] for i in range(len(s1))])
2695
sage: s.mean()
2696
1.354073591877...
2697
"""
2698
cdef TimeSeries c
2699
cdef Py_ssize_t i
2700
2701
M = len(acvs)-1
2702
2703
if M <= 0:
2704
raise ValueError, "M must be positive"
2705
2706
if not isinstance(acvs, TimeSeries):
2707
c = TimeSeries(acvs)
2708
else:
2709
c = acvs
2710
2711
# Also create the numpy vector v with entries c(1), ..., c(M).
2712
v = c[1:].numpy()
2713
2714
# 2. Create the autocovariances numpy 2d array A whose i,j entry
2715
# is c(|i-j|).
2716
import numpy
2717
A = numpy.array([[c[abs(j-k)] for j in range(M)] for k in range(M)])
2718
2719
# 3. Solve the equation A * x = v
2720
x = numpy.linalg.solve(A, v)
2721
2722
# 4. The entries of x give the linear filter coefficients.
2723
return TimeSeries(x)
2724
2725