
Linear Regression Using the Ordinary Least Squares Method

Ordinary Least Squares Method


Step 1: Import the necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Step 2: Load the CSV Data

# Load the dataset
data = pd.read_csv('house_data.csv')

# Extract the features (X) and target variable (y)
X = data['Size'].values
y = data['Price'].values

# Reshape X to be a 2D array
X = X.reshape(-1, 1)


Step 3: Add a Column of Ones to X for the Intercept
# Add a column of ones to X for the intercept
X_b = np.c_[np.ones((X.shape[0], 1)), X]
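To see what the column of ones does, here is a minimal sketch on made-up numbers. The sizes and coefficients below are hypothetical and are not taken from house_data.csv. Each row of the augmented matrix is [1, size], so its dot product with [intercept, slope] gives intercept + slope * size.

```python
import numpy as np

# Hypothetical sizes in square feet, for illustration only
X_demo = np.array([1000.0, 1500.0, 2000.0]).reshape(-1, 1)

# Prepend a column of ones so the first coefficient acts as the intercept
X_b_demo = np.c_[np.ones((X_demo.shape[0], 1)), X_demo]

# Hypothetical coefficients: intercept 50000, slope 100 per square foot
theta_demo = np.array([50000.0, 100.0])

# Each prediction is intercept * 1 + slope * size
pred_demo = X_b_demo.dot(theta_demo)

print(X_b_demo)
print(pred_demo)
```

Without the column of ones, the fitted line would be forced through the origin.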

Step 4: Implement the OLS Method
# Calculate the OLS estimate of theta via the normal equation:
# theta = (X_b^T X_b)^(-1) X_b^T y, which yields [intercept, slope]
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
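The line above evaluates the normal equation, theta = (X^T X)^(-1) X^T y, with an explicit matrix inverse. Explicit inversion can lose accuracy when X^T X is close to singular, so a common alternative is np.linalg.lstsq, which solves the same least-squares problem without forming the inverse. The sketch below compares the two on synthetic data. The data, the seed, and the variable names are assumptions made for illustration and do not come from house_data.csv.

```python
import numpy as np

# Synthetic stand-in data (hypothetical): price = 50000 + 100 * size + noise
rng = np.random.default_rng(0)
size = rng.uniform(500, 3000, 50)
price = 50000 + 100 * size + rng.normal(0, 5000, 50)

# Design matrix with a leading column of ones for the intercept
X_b_syn = np.c_[np.ones((size.shape[0], 1)), size.reshape(-1, 1)]

# Normal equation with an explicit inverse, as in Step 4
theta_inv = np.linalg.inv(X_b_syn.T.dot(X_b_syn)).dot(X_b_syn.T).dot(price)

# Least-squares solver: same minimization problem, no explicit inverse
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(
    X_b_syn, price, rcond=None
)

print("normal equation:", theta_inv)
print("lstsq:          ", theta_lstsq)
```

On well-conditioned data like this, both approaches should agree closely. With the tutorial's own variables, the equivalent call would be np.linalg.lstsq(X_b, y, rcond=None).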

Step 5: Make Predictions
# Make predictions
y_pred = X_b.dot(theta_best)
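Predictions alone do not say how well the line fits. One way to quantify the fit is the mean squared error together with the coefficient of determination (R^2). The helper below is a sketch, and its name, regression_metrics, is hypothetical. The three-point example uses made-up numbers.

```python
import numpy as np

def regression_metrics(y_true, y_hat):
    """Return (mse, r2) for observed values and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)

    # Residuals are the differences between observed and predicted values
    residuals = y_true - y_hat
    mse = np.mean(residuals ** 2)

    # R^2 compares the residual sum of squares with the total sum of squares
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return mse, r2

# Tiny made-up example, for illustration only
mse_demo, r2_demo = regression_metrics([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
print(mse_demo, r2_demo)
```

With the tutorial's variables, the call would be regression_metrics(y, y_pred). An R^2 near 1 means the line explains most of the variance in the target.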

Step 6: Visualize the Results
# Plot the data and the regression line
plt.scatter(X, y, color='blue', label='Data')
plt.plot(X, y_pred, color='red', label='Regression Line')
plt.xlabel('Size (Square Feet)')
plt.ylabel('Price (Dollars)')
plt.legend()
plt.show()







 
