GitHub Repository: suyashi29/python-su
Path: blob/master/Time Forecasting using Python/1.3 Practice Questions based on moving Average.ipynb
Kernel: Python 3 (ipykernel)

1. Determine the moving average

  • Determine the moving average with a window size of 2 for the list [30, 29, 35, 33, 31, 37, 39, 40, 41, 45].

  • Using the list [100, 95, 90, 85, 80, 75, 70, 65, 60, 55], calculate the moving average with a window size of 6.

  • Find the moving average for the list of integers [4, 8, 12, 16, 20, 24, 28, 32, 36, 40] with a window size of 3.

  • Calculate the moving average using a window size of 4 for the list [5, 10, 15, 20, 25, 30, 35, 40, 45, 50].

  • For the list [11, 22, 33, 44, 55, 66, 77, 88, 99, 110], find the moving average with a window size of 2.

  • Determine the moving average with a window size of 3 for the list [9, 18, 27, 36, 45, 54, 63, 72, 81, 90].

import numpy as np

# List of integers
data_q7 = np.array([5, 10, 15, 20, 25, 30, 35, 40, 45, 50])

# Window size
window_size_q7 = 4

# Calculate the moving average
moving_avg_q7 = np.convolve(data_q7, np.ones(window_size_q7)/window_size_q7, mode='valid')

print("Moving Average (Window Size 4) for Question 7:", moving_avg_q7)
Moving Average (Window Size 4) for Question 7: [12.5 17.5 22.5 27.5 32.5 37.5 42.5]
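
To make explicit what the `np.convolve(..., mode='valid')` call computes, the sketch below spells out the same calculation in plain Python and applies it to all six lists from the exercise. The helper name `moving_average` is an illustrative assumption, not part of the original notebook; like `mode='valid'`, it only averages full windows, so a list of length n and a window of size k yields n - k + 1 values.

```python
def moving_average(values, window):
    """Return the simple moving averages of `values`, full windows only."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Slide a window of the given size along the list and average each slice.
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# The (list, window size) pairs from the six practice questions above.
questions = [
    ([30, 29, 35, 33, 31, 37, 39, 40, 41, 45], 2),
    ([100, 95, 90, 85, 80, 75, 70, 65, 60, 55], 6),
    ([4, 8, 12, 16, 20, 24, 28, 32, 36, 40], 3),
    ([5, 10, 15, 20, 25, 30, 35, 40, 45, 50], 4),
    ([11, 22, 33, 44, 55, 66, 77, 88, 99, 110], 2),
    ([9, 18, 27, 36, 45, 54, 63, 72, 81, 90], 3),
]

for data, window in questions:
    print(f"window={window}:", moving_average(data, window))
```

For the window-size-4 list, this should agree with the NumPy result printed above.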

Objective:

2. Perform an Augmented Dickey-Fuller (ADF) test on a dataset containing attrition data to determine whether the attrition rate is stationary.

  • Dataset: You are provided with a dataset named attrition_data.csv that contains the following columns:

  • Date: The date of the recorded attrition rate (format: YYYY-MM-DD).

  • Attr_rate: The attrition rate for the corresponding date.

Instructions:

  • Load the Data:

  • Import necessary libraries such as pandas, statsmodels, and matplotlib.

  • Load the dataset into a pandas DataFrame.

  • Ensure the Date column is parsed as datetime objects and set it as the index of the DataFrame.

  • Visualize the Data:

  • Plot the Attr_rate over time to visually inspect if the series appears to be stationary or non-stationary.

  • Perform the ADF Test:

  • Use the adfuller function from the statsmodels library to perform the ADF test on the Attr_rate series.

  • Extract and print the following from the test results:

  • ADF Statistic

  • p-value

  • Number of lags used

  • Number of observations used

  • Critical values for 1%, 5%, and 10% levels

Interpret the Results:

Based on the p-value and the critical values, determine whether to reject the null hypothesis of the ADF test. Discuss whether the attrition rate series is stationary or not.
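
The loading instructions ask for the Date column to be parsed as datetimes and used as the index. A minimal sketch of that step follows, assuming the column names given in the task description (`Date`, `Attr_rate`). The inline CSV is a hypothetical stand-in with made-up values so the example runs without `attrition_data.csv`; with the real file, pass its path to `pd.read_csv` instead. The notebook's own cells refer to the column as `'Attrition Rate'`, so adjust the names to whatever the file actually contains.

```python
import io

import pandas as pd

# Hypothetical stand-in for attrition_data.csv: these values are made up
# for illustration and are NOT taken from the real dataset.
sample_csv = io.StringIO(
    "Date,Attr_rate\n"
    "2023-01-01,0.12\n"
    "2023-02-01,0.15\n"
    "2023-03-01,0.11\n"
)

# parse_dates converts the Date column to datetimes;
# index_col makes that column the DataFrame index.
df = pd.read_csv(sample_csv, parse_dates=["Date"], index_col="Date")

print(df.index.dtype)
print(df.head())
```

With a datetime index in place, plotting `df['Attr_rate']` puts the dates on the x-axis rather than row numbers.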

# Step 1: Load the Data
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

# Load the dataset
data = pd.read_csv('attrition_data.csv')
data.head()

# Step 1: Load the Data
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

# Load the dataset
data = pd.read_csv('attrition_data.csv')

# Step 2: Visualize the Data
plt.figure(figsize=(10, 6))
plt.plot(data['Attrition Rate'], label='Attrition Rate')
plt.title('Attrition Rate Over Time')
plt.xlabel('Date')
plt.ylabel('Attrition Rate')
plt.legend()
plt.show()

# Step 3: Perform the ADF Test
adf_test = adfuller(data['Attrition Rate'])
adf_statistic = adf_test[0]
p_value = adf_test[1]
used_lag = adf_test[2]
n_obs = adf_test[3]
critical_values = adf_test[4]

print(f'ADF Statistic: {adf_statistic}')
print(f'p-value: {p_value}')
print(f'Number of lags used: {used_lag}')
print(f'Number of observations used: {n_obs}')
print('Critical Values:')
for key, value in critical_values.items():
    print(f' {key}: {value}')

# Step 4: Interpret the Results
if p_value < 0.05:
    print('Reject the null hypothesis: The series is stationary.')
else:
    print('Fail to reject the null hypothesis: The series is non-stationary.')
[Figure: line plot titled "Attrition Rate Over Time"; x-axis: Date, y-axis: Attrition Rate]
ADF Statistic: -0.10321535298059625
p-value: 0.9491173286548007
Number of lags used: 4
Number of observations used: 8
Critical Values:
 1%: -4.6651863281249994
 5%: -3.3671868750000002
 10%: -2.802960625
Fail to reject the null hypothesis: The series is non-stationary.
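
The exercise asks for an interpretation based on both the p-value and the critical values, whereas the cell above branches on the p-value only. As a minimal sketch of the second check, the snippet below compares the reported ADF statistic with each critical value. The numbers are copied from the output above; `compare_to_critical_values` is a hypothetical helper introduced here for illustration.

```python
def compare_to_critical_values(adf_statistic, critical_values):
    """Map each significance level to True if the unit-root null is rejected."""
    # The ADF test is one-sided: the null hypothesis (a unit root, i.e.
    # non-stationarity) is rejected only when the statistic is MORE NEGATIVE
    # than the critical value at that level.
    return {
        level: adf_statistic < critical
        for level, critical in critical_values.items()
    }


# Values copied from the notebook output above.
adf_statistic = -0.10321535298059625
p_value = 0.9491173286548007
critical_values = {
    "1%": -4.6651863281249994,
    "5%": -3.3671868750000002,
    "10%": -2.802960625,
}

decisions = compare_to_critical_values(adf_statistic, critical_values)
for level, rejected in decisions.items():
    verdict = "reject" if rejected else "fail to reject"
    print(f"{level} level: {verdict} the null hypothesis")

print("p-value below 0.05:", p_value < 0.05)
```

Here the statistic lies above (is less negative than) all three critical values, which agrees with the p-value check: the null hypothesis of a unit root is not rejected at any of the listed levels.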