
Quadratic Regression

Quadratic regression is a statistical method used to model the relationship between two variables with a parabolic best-fit curve rather than a straight line. It is the right tool when the data relationship appears curvilinear. The goal is to fit a quadratic equation

y = ax^2 + bx + c

to the observed data, providing a more nuanced model of the relationship. Despite its historical and biological connotations, "regression" in this mathematical context simply means fitting a model that describes how one variable depends on another; quadratic regression does this for data that follow a curvilinear pattern.

Working with quadratic regression

These calculations can become quite complex and tedious. The least-squares formulas behind quadratic regression involve several sums over the data, and in practice we can let a graphing calculator handle them. This saves us from working through so many steps by hand -- but we still must understand the core concepts at play.
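For reference, the fit comes from minimizing the sum of squared residuals, Σ(y - (ax^2 + bx + c))^2, which leads to a system of three linear "normal equations" in the coefficients a, b, and c (where n is the number of data points):

a·Σx^4 + b·Σx^3 + c·Σx^2 = Σ(x^2·y)
a·Σx^3 + b·Σx^2 + c·Σx = Σ(x·y)
a·Σx^2 + b·Σx + c·n = Σy

Solving this 3×3 system is exactly what a calculator's quadratic regression command automates.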

Let's try a practice problem that includes quadratic regression. Consider the following set of data: {(-3, 7.5), (-2, 3), (-1, 0.5), (0, 1), (1, 3), (2, 6), (3, 14)}

x    -3    -2    -1     0     1     2     3
y   7.5     3   0.5     1     3     6    14

Can we determine the quadratic regression for this set?

Our first step is to enter our x-coordinates and y-coordinates into our graphing calculator. We can then run the calculator's quadratic regression operation. This gives us the equation of the parabola that best approximates the points: y = 1.1071x^2 + x + 0.5714


Great! Now all we need to do is plot our graph: the seven data points together with the upward-opening parabola y = 1.1071x^2 + x + 0.5714.

We also get a measure of the fit's predictive power: the coefficient of determination here is R^2 ≈ 0.9884 (a correlation of about 0.9942). That's pretty accurate -- and it tells us that our quadratic regression worked!
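If you'd like to double-check the calculator (or don't have one handy), NumPy's polyfit performs the same least-squares fit. A quick sketch:

import numpy as np

x = np.array([-3, -2, -1, 0, 1, 2, 3])
y = np.array([7.5, 3, 0.5, 1, 3, 6, 14])

# polyfit returns the least-squares coefficients [a, b, c]
a, b, c = np.polyfit(x, y, 2)
print(f'y = {a:.4f}x^2 + {b:.4f}x + {c:.4f}')
# prints: y = 1.1071x^2 + 1.0000x + 0.5714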

Quadratic regression in Python with scikit-learn

The same fit can be carried out in Python. The example below loads (X, Y) pairs from a CSV file, expands X into polynomial features (1, x, x^2), and fits an ordinary linear model to those features:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Load data from CSV file
data = pd.read_csv('data1.csv')  # Replace 'data1.csv' with the path to your file

# Extract the independent (X) and dependent (Y) variables
x = data['X'].values.reshape(-1, 1)
y = data['Y'].values

# Transform the data to include x^2
poly = PolynomialFeatures(degree=2)
x_poly = poly.fit_transform(x)

# Fit the model
model = LinearRegression()
model.fit(x_poly, y)

# Make predictions
y_pred = model.predict(x_poly)

# Plotting the results
plt.scatter(x, y, color='blue', label='Original data')
plt.plot(x, y_pred, color='red', label='Quadratic regression')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Quadratic Regression')
plt.legend()
plt.show()
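After fitting, it's worth inspecting the recovered coefficients and the goodness of fit. For the sample data set below, where Y is exactly X^2, we'd expect an intercept near 0, coefficients near [0, 0, 1], and an R^2 of 1. A quick check using scikit-learn's r2_score:

from sklearn.metrics import r2_score

# x_poly's columns are [1, x, x^2]; the bias column's coefficient
# stays 0 because LinearRegression fits the intercept separately
print('Intercept:', model.intercept_)
print('Coefficients:', model.coef_)
print('R^2:', r2_score(y, y_pred))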

Data set (contents of data1.csv):

X,Y
1,1
2,4
3,9
4,16
5,25
6,36
7,49
8,64
9,81




A second approach: NumPy's polyfit

NumPy can also do the fit directly, without scikit-learn's preprocessing step. Suppose ball.csv holds time and height readings for a ball tossed into the air:

# importing packages and modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import r2_score

# load the time/height readings
dataset = pd.read_csv('ball.csv')

# scatter plot of the raw data
sns.scatterplot(data=dataset, x='time', y='height', hue='time')
plt.title('time vs height of the ball')
plt.xlabel('time')
plt.ylabel('height')
plt.show()

# degree 2 polynomial (quadratic) fit
model = np.poly1d(np.polyfit(dataset['time'], dataset['height'], 2))

# visualize the fitted parabola over the data points
polyline = np.linspace(0, 10, 100)  # assumes times lie in [0, 10]
plt.scatter(dataset['time'], dataset['height'])
plt.plot(polyline, model(polyline))
plt.show()

# the fitted quadratic, printed as a polynomial
print(model)

# r squared metric for the fit
print(r2_score(dataset['height'], model(dataset['time'])))
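The code above expects a ball.csv with time and height columns. A hypothetical example, generated from the projectile formula height = 20t - 4.9t^2 (the fit should then recover roughly those coefficients, with an R^2 near 1):

time,height
0,0.0
0.5,8.8
1,15.1
1.5,19.0
2,20.4
2.5,19.4
3,15.9
3.5,10.0
4,1.6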

