
Linear & Multilinear Regression

Linear regression is one of the most basic and commonly used statistical techniques in machine learning. It models the relationship between a dependent variable (target) and one or more independent variables (features). The objective is to find the linear equation that best fits the data, which can then be used to predict the dependent variable from the values of the independent variables.

Simple Linear Regression 

Simple linear regression involves a single independent variable (feature) and a single dependent variable (target). Their relationship is represented by a straight line.

It involves one independent variable: y = \beta_0 + \beta_1 x + \epsilon, where:

  • y is the dependent variable (target).
  • x is the independent variable (feature).
  • β0 is the y-intercept (constant term).
  • β1 is the slope of the line (coefficient for the feature).
  • ϵ is the error term (residual).
Multiple Linear Regression

In multiple linear regression, there are two or more independent variables (features). The relationship between the dependent variable and the independent variables is represented by a hyperplane in higher-dimensional space.
It involves multiple independent variables: y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon

Working Process of Linear Regression

  1. Hypothesis Representation:

    • The relationship between the dependent and independent variables is represented as a linear equation. The objective is to find the best-fitting line (in simple linear regression) or hyperplane (in multiple linear regression).
  2. Cost Function (Mean Squared Error):

    • The cost function measures the difference between the predicted and actual values. The most common cost function for linear regression is the Mean Squared Error (MSE): J(\beta_0, \beta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2, where \hat{y}^{(i)} is the predicted value, y^{(i)} is the actual value, and m is the number of training examples.
  3. Optimization (Gradient Descent):

    • The goal is to minimize the cost function by adjusting the parameters \beta_0 and \beta_1. Gradient Descent is a common optimization technique used to find the minimum of the cost function: \beta_j := \beta_j - \alpha \frac{\partial}{\partial \beta_j} J(\beta_0, \beta_1), where \alpha is the learning rate and \frac{\partial}{\partial \beta_j} J(\beta_0, \beta_1) is the partial derivative of the cost function with respect to \beta_j (a from-scratch sketch is shown after this list).
  4. Model Evaluation:

    • The performance of the linear regression model can be evaluated using metrics such as R-squared (R^2), Mean Absolute Error (MAE), or Root Mean Squared Error (RMSE); a short evaluation sketch follows the simple regression example below.
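
A minimal from-scratch sketch of steps 2 and 3 (MSE cost plus gradient descent updates) is shown below, using the same sample data as the scikit-learn example that follows; the learning rate and iteration count are illustrative assumptions, not prescribed values.

import numpy as np

# Sample data: the same five points used in the scikit-learn example below
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1, 3, 2, 3, 5], dtype=float)

m = len(X)                  # number of training examples
beta0, beta1 = 0.0, 0.0     # initial parameters
alpha = 0.05                # learning rate (assumed value)

for _ in range(2000):               # fixed number of iterations (assumed)
    y_hat = beta0 + beta1 * X       # hypothesis: predicted values
    error = y_hat - y
    cost = (1 / (2 * m)) * np.sum(error ** 2)   # MSE cost J(beta0, beta1)
    # Partial derivatives of the cost with respect to beta0 and beta1
    grad0 = (1 / m) * np.sum(error)
    grad1 = (1 / m) * np.sum(error * X)
    # Simultaneous update of both parameters
    beta0 -= alpha * grad0
    beta1 -= alpha * grad1

print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.3f}, final cost = {cost:.4f}")

With these settings the parameters should converge close to the ordinary least-squares solution for this data (roughly beta0 ≈ 0.4 and beta1 ≈ 0.8), matching the fit produced by scikit-learn in the next example.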
Implementation of Simple Linear Regression:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 3, 2, 3, 5])

# Creating and training the model
model = LinearRegression()
model.fit(X, y)

# Predicting
y_pred = model.predict(X)

# Plotting the data points and the regression line
plt.figure(figsize=(8, 6))
plt.scatter(X, y, color='blue', label='Data Points', s=100)  # Larger dots for clarity
plt.plot(X, y_pred, color='red', linewidth=2, label='Regression Line')  # Thicker line for better visibility
plt.xlabel('X', fontsize=14)
plt.ylabel('y', fontsize=14)
plt.title('Simple Linear Regression', fontsize=16)
plt.legend()
plt.grid(True)  # Adding grid for better reference
plt.show()
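
To illustrate step 4 (model evaluation), the fitted model above can be scored with standard metrics. This is a minimal sketch that reuses the y and y_pred arrays from the simple regression example.

import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Evaluate the simple linear regression fit on the training data
r2 = r2_score(y, y_pred)                       # R-squared
mae = mean_absolute_error(y, y_pred)           # Mean Absolute Error
rmse = np.sqrt(mean_squared_error(y, y_pred))  # Root Mean Squared Error

print(f"R^2: {r2:.3f}, MAE: {mae:.3f}, RMSE: {rmse:.3f}")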



Implementation of Multiple Linear Regression:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.linear_model import LinearRegression

# Sample data
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
y = np.array([2, 3, 4, 5, 6])

# Creating and training the model
model = LinearRegression()
model.fit(X, y)

# Predicting
y_pred = model.predict(X)

# Plotting the data points and the regression plane
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:, 0], X[:, 1], y, color='blue', label='Data Points')

# Creating the plane
x_surf, y_surf = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 100),
                             np.linspace(X[:, 1].min(), X[:, 1].max(), 100))
z_surf = model.predict(np.c_[x_surf.ravel(), y_surf.ravel()]).reshape(x_surf.shape)

# Plotting the plane
ax.plot_surface(x_surf, y_surf, z_surf, color='red', alpha=0.5)

ax.set_xlabel('X1')
ax.set_ylabel('X2')
ax.set_zlabel('y')
ax.set_title('Multiple Linear Regression')
plt.show()
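
The fitted hyperplane's parameters can also be read directly from the trained model; a short sketch, assuming the multiple regression model above is still in scope:

# Inspect the fitted intercept (beta_0) and feature coefficients (beta_1, beta_2)
print("Intercept (beta_0):", model.intercept_)
print("Coefficients (beta_1, beta_2):", model.coef_)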


