For small linear regression problems, we can solve for the parameters in closed form using the normal equation. Consider d-dimensional features and n samples of data. Remember, including the dummy (all-ones) feature, we have a design matrix X \in \mathbb{R}^{n \times \left(d+1\right)} and a target vector y \in \mathbb{R}^{n}. The cost function is:

\begin{equation} J\left(\theta\right) = \frac{1}{2} \sum_{i=1}^{n} \left(h_{\theta} \left(x^{(i)}\right) - y^{(i)}\right)^{2} \end{equation}

Since the stacked predictions over all samples are h = X \theta, we can write:

\begin{equation} J(\theta) = \frac{1}{2} \left(X \theta - y\right)^{T} \left(X \theta - y\right) \end{equation}
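Expanding and differentiating with respect to \theta (a standard matrix-calculus step) gives the gradient:

\begin{equation} \nabla_{\theta} J\left(\theta\right) = X^{T}\left(X \theta - y\right) = X^{T}X \theta - X^{T}y \end{equation}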

Taking the gradient of J with respect to \theta, setting it to zero, and solving (assuming X^{T}X is invertible, in which case \left(X^{T}X\right)^{-1}X^{T} is the pseudoinverse of X) gives:

\begin{equation} \theta = \left(X^{T}X\right)^{-1} X^{T}y \end{equation}
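The closed-form solution above can be sketched in a few lines of NumPy. This is a minimal illustration on made-up toy data (the feature values and the true line y = 2x + 1 are hypothetical, chosen so the recovered parameters are easy to check); in practice, solving the linear system X^{T}X \theta = X^{T}y is preferred over explicitly inverting X^{T}X for numerical stability.

```python
import numpy as np

# Toy data: n = 5 samples, d = 1 feature (hypothetical values for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0  # targets from the line y = 2x + 1, so theta should recover [1, 2]

# Design matrix with the dummy (all-ones) feature: X in R^{n x (d+1)}
X = np.column_stack([np.ones_like(x), x])

# Normal equation: theta = (X^T X)^{-1} X^T y,
# computed by solving the system rather than forming the inverse.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # → [1. 2.]
```

Note that np.linalg.lstsq(X, y) would solve the same least-squares problem directly via the pseudoinverse, which is more robust when X^{T}X is ill-conditioned.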