ESBS

Biophysics practicals 2012

This SAGE notebook contains some hints and programs that will allow you to analyse the data recorded during the experiments. Exemples are provided to show you the more appropriate way to analyse and display your data. The graphics could be exported directly to fit within your reports as png files.

1. TP 2 Mass transport : Non-linear fitting of the time dependant evolution of the gel's temperature

The time evolution of the gel's temperature could be modelled using the following equation:

$T(t)=A+B\left(1-e^{-Ct}\right)$

where $A=T_{0}, B=\frac{P}{aC} and C=a$

First : Get an idea of the effect of each of the parameters on the curve using the following interactive plot

In particular, check which parameter influences 1) the value of the plateau 2) the initial slope

@interact
def _(A=(10..100),B=(1..10),CI=(1..20)):
    C=CI*0.001
    f(t) = A + B*(1-exp(-C*t))
    pf = plot(f,0,1000,axes_labels=['Time (s)','Temp (Deg C)'])
    show(pf)

Second : We should define the function to be fitted, in a way that will be recognized by NUMPY

from numpy import *
def temp(time,pA,pB,pC):
    """ Equation describing heat exchange within an electrophoresis gel"""
    y= pA+pB*(1 - exp(-time*pC))
    return y

Third : The measured values have to be entered in the program. First option : enter the values manually as shown below

Times = [10, 30, 100, 300, 400, 600, 800, 1200, 2000]
Texp =  [20.2, 20.5, 21.3, 22.9, 23.3, 23.7, 23.9, 24.0, 24.0]

Second option: download a text file named as you like (just set the right value for the fname variable), that contains the recorded temperatures according the following format:

Fri Mar 16 17:31:34 2012 23.56
Fri Mar 16 17:31:40 2012 23.5

And run the following script

...

# arduino prep
# Read Arduino data 
# B. Kieffer Mar 2012

import time 

# 1. Read data into a list all_line
fname = 'arduino.dat'

myfile = open(DATA+fname,'r')
all_lines = myfile.readlines()
myfile.close()

txtime = []
Texp = []
Times = []
# 2. Extract relevent informations from all_line
for line in all_lines:
	line_field = line.split()
	time_field = line_field[3]
	val = line_field[5]
	txtime.append(time_field)
	Texp.append(float(val))

# 3. Build a vector of time delays starting from the first point
first = True	
for t in txtime :
   t1 = time.strptime(t, "%H:%M:%S")
   t_sec = t1.tm_hour*3600+t1.tm_min*60+t1.tm_sec
   if first :
      Times.append(0)
      t0 = t_sec # First point is the time 0
      first = False
   else :
      Times.append(t_sec-t0)
print " %i lines read from file %s " % (len(Times),fname)

61 lines read from file arduino.dat

Texp

[20.0, 20.190000000000001, 20.5, 20.75, 21.059999999999999, 21.25, 21.5, 21.75, 21.870000000000001, 22.059999999999999, 22.25, 22.370000000000001, 22.559999999999999, 22.690000000000001, 22.75, 22.870000000000001, 22.940000000000001, 23.059999999999999, 23.190000000000001, 23.370000000000001, 23.440000000000001, 23.5, 23.690000000000001, 23.690000000000001, 23.75, 23.870000000000001, 23.940000000000001, 24.0, 24.059999999999999, 24.120000000000001, 24.309999999999999, 24.309999999999999, 24.309999999999999, 24.370000000000001, 24.370000000000001, 24.440000000000001, 24.620000000000001, 24.620000000000001, 24.620000000000001, 24.75, 24.75, 24.75, 24.809999999999999, 24.940000000000001, 25.0, 25.059999999999999, 25.0, 24.940000000000001, 25.0, 25.059999999999999, 25.059999999999999, 25.190000000000001, 25.25, 25.190000000000001, 25.25, 25.190000000000001, 25.120000000000001, 25.190000000000001, 25.190000000000001, 25.120000000000001, 25.25]

Times

[0, 31, 60, 91, 120, 150, 180, 210, 240, 270, 301, 330, 361, 390, 421, 450, 481, 510, 541, 570, 601, 630, 661, 690, 721, 750, 780, 810, 840, 870, 900, 931, 960, 991, 1020, 1051, 1080, 1111, 1140, 1171, 1200, 1231, 1260, 1291, 1320, 1351, 1380, 1410, 1440, 1470, 1500, 1530, 1560, 1590, 1621, 1650, 1681, 1710, 1741, 1770, 1801]

Finally : Perform the non-linear fit using the program curve_fit that is included in the optimize library of SCIPY

from scipy import optimize
# 1. Initialize parameters
Pini = [18., 2., 0.001] # Initial guess of the parameters
Pow = 10.0 # Power in the gel

# 2. Perform the fit
PP, pcov = optimize.curve_fit(temp, array(Times), array(Texp), Pini)

# 3. Report and plot the results
pA = PP[0] ; pB = PP[1] ; pC = PP[2]
Cap = Pow/(pB*pC)
dA = pcov[0][0]^0.5
dB = pcov[1][1]^0.5
dC = pcov[2][2]^0.5
dcpr = dB/pB + dC/pC
dCap = dcpr*Cap

print "Optimized parameters:"
print "T0: %f +/- %f (Deg C) " % (pA,dA)
print "Calorific capacity: %f +/- %f (J.K-1)" % (Cap,dCap)
print "Rate: %f +/- %f (s-1)" %(pC,dC)
a = scatter_plot(zip(Times,Texp))
a += plot(temp(x,PP[0],PP[1],PP[2]), 0,max(Times),axes_labels=['Time (s)','Temp (Deg C)'])
show(a)

Optimized parameters:
T0: 20.059797 +/- 0.032958 (Deg C) 
Calorific capacity: 1141.509034 +/- 28.951827 (J.K-1)
Rate: 0.001589 +/- 0.000030 (s-1)

Note on error estimates :

The program optimize.curve_fit returns, along with the optimized parameters, a covariance matrix on these parameters. The diagonal terms on this matrix provides you with the variance on the parameters. This variance is a good estimate of the uncertainties, provided that your model is correct and that the noise is random.

The standard deviation on the T0 (initial temperature) and a (heating rate) can be calculated directly from the variance of the first and last parameter (A and C)

dA = pcov[0][0]^0.5
dC = pcov[2][2]^0.5

The error on the Heat capacity value, however, should be obtained from a error propagation calculation, since its value results from two parameters B and C :

$\frac{\Delta{Cap}}{Cap} = \frac{\Delta{B}}{B}+\frac{\Delta{C}}{C}$

with $Cap = \frac{Pow}{BxC}$

B and C beeing the optimized parameters.

Assess the nature of the noise

It is useful to analyse the distribution of the experimental points around those calculated from the model. In particular, the shape of the distribution will help to assess the random or systematic nature of the noise.

import numpy as np
YE = Texp
YC = []
for ts in Times :
    YC.append(temp(ts,PP[0],PP[1],PP[2]))    
YE = array(YE)
YC = array(YC)
YD = list(YE-YC)
a,b = np.histogram(YD, bins=20)
print b
bar_chart(a)

[-0.13478857 -0.1211206  -0.10745263 -0.09378466 -0.08011669 -0.06644872
 -0.05278075 -0.03911278 -0.02544481 -0.01177684  0.00189113  0.01555911
  0.02922708  0.04289505  0.05656302  0.07023099  0.08389896  0.09756693
  0.1112349   0.12490287  0.13857084]