CoCalc -- Worksheet_Feb13_due

Kernel: Python 3 (Ubuntu Linux)

Exercises

In [21]:

#1 
import numpy as np
dataset = np.arange(1,31,2)
dataset = dataset.reshape(5,3)
print(dataset)

Out[21]:

[[ 1  3  5]
 [ 7  9 11]
 [13 15 17]
 [19 21 23]
 [25 27 29]]

In [22]:

#2 
dataset[0:2,0:2]

Out[22]:

array([[1, 3],
       [7, 9]])

In [23]:

#3
import string

x = string.ascii_lowercase
L = np.array(list(x))

print(L)

Out[23]:

['a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r'
 's' 't' 'u' 'v' 'w' 'x' 'y' 'z']

In [24]:

#4
rands = np.random.rand(26)  # one uniform random draw in [0, 1) per letter
mask = rands < .5           # True where the draw is below 0.5
print(L[mask])              # keep only the letters where the mask is True

Out[24]:

['c' 'e' 'f' 'g' 'i' 'k' 'l' 'm' 'o' 'q' 'r' 'u' 'v' 'x' 'z']
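
The selection in #4 changes on every run because `np.random.rand` is unseeded. A minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator and an arbitrary seed of 0 (both are illustrative choices, not part of the worksheet):

```python
import string
import numpy as np

letters = np.array(list(string.ascii_lowercase))

# Seeded generator so the same letters are selected on every run
# (the seed value 0 is an arbitrary, illustrative choice).
rng = np.random.default_rng(0)
rands = rng.random(letters.size)   # 26 uniform draws in [0, 1)

# Boolean mask: True where the draw is below 0.5
mask = rands < 0.5
selected = letters[mask]

print(selected)
print(f"{mask.sum()} of {letters.size} letters kept")
```

Indexing with a boolean array returns a new array containing only the elements where the mask is `True`, so `selected` always has exactly `mask.sum()` entries.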

In [25]:

# 5
import numpy as np #imports the numpy module under the alias np
import matplotlib.pyplot as plt #imports matplotlib's pyplot interface under the alias plt

x1 = np.linspace(0.0, 5.0) #1D array of 50 evenly spaced points from 0.0 to 5.0 (50 is the default count)
x2 = np.linspace(0.0, 2.0) #1D array of 50 evenly spaced points from 0.0 to 2.0

y1 = np.cos(2*np.pi*x1) * np.exp(-x1) #cos(2*pi*x) damped by e^-x for each x in x1
y2 = np.cos(2*np.pi*x2) #cos(2*pi*x) for each x in x2

print(plt.subplot(2, 1, 1)) #selects the top panel of a 2-row, 1-column grid and prints the Axes object

plt.plot(x1, y1, 'o-')  #plots y1 against x1 with circle markers joined by a solid line
plt.title('A tale of 2 subplots') #title above the top panel
plt.ylabel('Damped oscillation') #y-axis label for subplot 1

plt.subplot(2, 1, 2) #selects the bottom panel
plt.plot(x2, y2, '.-') #plots y2 against x2 with point markers joined by a solid line
plt.xlabel('time (s)') #x-axis label for subplot 2
plt.ylabel('Undamped') #y-axis label for subplot 2

plt.show() #renders both panels in one figure

Out[25]:

AxesSubplot(0.125,0.536818;0.775x0.343182)

In [26]:

#6. Read into a numpy array the data set "HIP_star.dat". This data set contains information about stars; we are only interested in columns "Vmag", "Plx", and "B-V", so make sure you only read those. Evaluate whether you need to skip rows. Hint: open the file with a text editor first.

# Read the file once, skipping the one-line header; columns 1, 4 and 8 are Vmag, Plx and B-V
values = np.genfromtxt('HIP_star.dat', usecols=(1, 4, 8), skip_header=1)
vmag, plx, bv = values.T  # one 1D array per selected column
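
To see how `usecols` and `skip_header` behave without the real file, here is a small sketch that parses an in-memory table. The column layout and every number below are made up for illustration; only the `np.genfromtxt` arguments mirror the cell above.

```python
import io
import numpy as np

# Hypothetical stand-in for a whitespace-delimited star catalogue:
# one header row, then nine numeric columns. All values are invented.
sample = io.StringIO(
    "HIP Vmag RA DE Plx pmRA pmDE e_Plx B-V\n"
    "1 9.0 0.1 -10.0 20.0 100.0 -1.0 2.0 0.50\n"
    "2 8.5 0.2 -20.0 25.0 150.0 -2.0 1.0 0.75\n"
    "3 10.0 0.3 -30.0 50.0 -40.0 -3.0 1.5 1.25\n"
)

# skip_header=1 drops the header line; usecols keeps columns 1, 4 and 8
values = np.genfromtxt(sample, usecols=(1, 4, 8), skip_header=1)

# Transposing lets each selected column unpack into its own 1D array
vmag, plx, bv = values.T

print(values.shape)
print(vmag, plx, bv)
```

Reading once and unpacking the transpose avoids parsing the same file several times.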

In [27]:

#7
values.shape

Out[27]:

(2678, 3)

In [28]:

print(plx)

Out[28]:

[21.9  23.84 24.45 ... 22.91 22.19 24.63]

In [29]:

#8/9
#Define a function, logL, that calculates the (log of) a star's luminosity starting from Vmag (the visual brightness) and Plx (the parallax angle, which is inversely proportional to a star's distance). The function should implement the following relationship: log L = (15 - Vmag - 5 log10(Plx)) / 2.5

def logL(vmag, plx):
    """Return the log luminosity from visual magnitude and parallax."""
    return (15 - vmag - 5 * np.log10(plx)) / 2.5

In [30]:

lum = logL(vmag,plx)
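
A quick arithmetic check of `logL` on hand-picked inputs can catch sign or parenthesis mistakes before plotting. The sketch below redefines the function so it runs on its own; the input values are illustrative, not taken from the catalogue.

```python
import numpy as np

def logL(vmag, plx):
    # Same relationship as the worksheet's function
    return (15 - vmag - 5 * np.log10(plx)) / 2.5

# Scalar check: 5*log10(100) = 10, so (15 - 5 - 10) / 2.5 = 0
single = logL(5.0, 100.0)
print(single)

# The function is vectorized: NumPy applies it element-wise to arrays.
vmag_demo = np.array([5.0, 10.0, 0.0])
plx_demo = np.array([100.0, 100.0, 10.0])
# Expected by hand: (15-5-10)/2.5 = 0, (15-10-10)/2.5 = -2, (15-0-5)/2.5 = 4
batch = logL(vmag_demo, plx_demo)
print(batch)
```

Because `np.log10` works element-wise, the same function handles a single star or the whole `vmag`/`plx` arrays without a loop.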

In [31]:

#10. Make a scatter plot that has B-V for the stars on the x axis (make sure you index your array correctly!), and their log luminosity (expressed by the function you just made) on the y axis. This is called an H-R diagram (Hertzsprung-Russell diagram). It encodes information about the temperature of stars (expressed by color, or B-V) and their luminosity.
#Add axes titles: B-V for the x axis, Log L (as appropriate) for the y axis.
#Adjust axes limits to be [-0.5, 3] for B-V, [-3, 4] for log L.

In [32]:

import matplotlib.pyplot as plt

scat = plt.scatter(bv, lum)  # B-V on the x axis, log luminosity on the y axis
plt.title("H-R Diagram")
plt.xlabel("B-V")
plt.ylabel("Log L")  # lum holds the log of the luminosity, as the exercise asks
plt.xlim(-.5, 3)
plt.ylim(-3, 4);

Out[32]:

In [48]:

#Vmag = Color

scat_color = plt.scatter(bv, lum, c=vmag)  # color each point by its Vmag
plt.colorbar(scat_color, label="Vmag")     # legend for the color scale
plt.title("H-R Diagram")
plt.xlabel("B-V")
plt.ylabel("Log L")
plt.xlim(-.5, 3)
plt.ylim(-3, 4);

Out[48]:

In [35]:

#Vmag = Size

scat_size = plt.scatter(bv, lum, s=vmag)  # marker area (in points^2) set by Vmag
plt.title("H-R Diagram")
plt.xlabel("B-V")
plt.ylabel("Log L")
plt.xlim(-.5, 3)
plt.ylim(-3, 4);

Out[35]:

In [62]:

#12. Visualize B-V as a histogram.

plt.hist(bv, bins=100)

Out[62]:

(array([  2.,   2.,   1.,   4.,   6.,   8.,  12.,   7.,   1.,  12.,  12.,
         16.,  17.,  20.,  13.,  24.,  18.,  27.,  48.,  50.,  67.,  82.,
         95.,  99., 126., 113., 145., 156., 105., 111.,  94.,  89.,  87.,
         75.,  71.,  86.,  72.,  69.,  56.,  64.,  47.,  48.,  56.,  23.,
         39.,  36.,  34.,  33.,  26.,  27.,  23.,  19.,  22.,  15.,  12.,
          5.,   8.,   5.,  14.,  11.,   2.,   1.,   3.,   0.,   0.,   0.,
          1.,   0.,   0.,   3.,   0.,   0.,   1.,   0.,   0.,   0.,   0.,
          0.,   0.,   0.,   0.,   1.,   0.,   0.,   0.,   0.,   0.,   0.,
          0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
          1.]),
 array([-0.158  , -0.12842, -0.09884, -0.06926, -0.03968, -0.0101 ,
         0.01948,  0.04906,  0.07864,  0.10822,  0.1378 ,  0.16738,
         0.19696,  0.22654,  0.25612,  0.2857 ,  0.31528,  0.34486,
         0.37444,  0.40402,  0.4336 ,  0.46318,  0.49276,  0.52234,
         0.55192,  0.5815 ,  0.61108,  0.64066,  0.67024,  0.69982,
         0.7294 ,  0.75898,  0.78856,  0.81814,  0.84772,  0.8773 ,
         0.90688,  0.93646,  0.96604,  0.99562,  1.0252 ,  1.05478,
         1.08436,  1.11394,  1.14352,  1.1731 ,  1.20268,  1.23226,
         1.26184,  1.29142,  1.321  ,  1.35058,  1.38016,  1.40974,
         1.43932,  1.4689 ,  1.49848,  1.52806,  1.55764,  1.58722,
         1.6168 ,  1.64638,  1.67596,  1.70554,  1.73512,  1.7647 ,
         1.79428,  1.82386,  1.85344,  1.88302,  1.9126 ,  1.94218,
         1.97176,  2.00134,  2.03092,  2.0605 ,  2.09008,  2.11966,
         2.14924,  2.17882,  2.2084 ,  2.23798,  2.26756,  2.29714,
         2.32672,  2.3563 ,  2.38588,  2.41546,  2.44504,  2.47462,
         2.5042 ,  2.53378,  2.56336,  2.59294,  2.62252,  2.6521 ,
         2.68168,  2.71126,  2.74084,  2.77042,  2.8    ]),
 <a list of 100 Patch objects>)

In [60]:

#13. For the B-V array, calculate its mean, its median, and its standard deviation.

mean = np.mean(bv)
median = np.median(bv)
std = np.std(bv)

print(mean,median,std)

Out[60]:

0.7615298730395818 0.7104999999999999 0.31812819990080615

In [61]:

#Extra points:

#14. Write down the 10 and 90 percentile levels and explain what they are.

print("Ten percent of the stars in this dataset have a B-V value less than or equal to", round(np.percentile(bv,10),2))
print("Ninety percent of the stars in this dataset have a B-V value less than or equal to", round(np.percentile(bv,90),2))

Out[61]:

Ten percent of the stars in this dataset have a B-V value less than or equal to 0.42
Ninety percent of the stars in this dataset have a B-V value less than or equal to 1.2
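
The percentile interpretation can be checked directly by counting how many values fall at or below each level. A minimal sketch on illustrative data (the integers 1 to 100, not the star catalogue), using NumPy's default linear interpolation:

```python
import numpy as np

# Illustrative data: the integers 1..100 (not the star catalogue)
data = np.arange(1, 101)

p10 = np.percentile(data, 10)
p90 = np.percentile(data, 90)

# Fraction of values at or below each percentile level
frac10 = np.mean(data <= p10)
frac90 = np.mean(data <= p90)

print(p10, frac10)
print(p90, frac90)
```

The same `np.mean(bv <= np.percentile(bv, 10))` check can be run on the B-V array; with real data the fraction is close to, though not always exactly, 0.1.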

In [0]:

15. Are the mean and the median different? In either case, what does this tell us about the distribution?

In this case, the mean (0.76) and the median (0.71) are different. All that tells us for sure is that the distribution is not symmetric. There is a rule of thumb that a mean larger than the median indicates a right skew, but this paper explains why that is an unsafe assumption. From the histogram, however, we can clearly see that the data skews right (has a positive skew).
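
Rather than inferring the direction of skew from the mean and median, it can be measured with the moment-based skewness coefficient. The sketch below is one common definition (the biased estimator g1 = m3 / m2^1.5), applied to synthetic exponential data because the catalogue file is not assumed to be available; `sample_skewness` is a hypothetical helper name.

```python
import numpy as np

def sample_skewness(x):
    """Moment-based skewness g1 = m3 / m2**1.5 (biased estimator)."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    m2 = np.mean(dev**2)   # second central moment (population variance)
    m3 = np.mean(dev**3)   # third central moment
    return m3 / m2**1.5

# Synthetic right-skewed sample (exponential), standing in for real data
rng = np.random.default_rng(42)
demo = rng.exponential(scale=1.0, size=10_000)

print("mean  :", demo.mean())
print("median:", np.median(demo))
print("skew  :", sample_skewness(demo))
```

A positive result indicates a longer right tail and a negative one a longer left tail; calling `sample_skewness(bv)` would give the corresponding number for the B-V data.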
