The CAPM model:
$$R(k,i) = a(i) + C(k) + b(i) * (M(k) - C(k)) + V(k,i)$$
for samples k = 1, ... , m and assets i = 1, ... , n, where R(k,i) is the return of asset i in sample k, C(k) is the risk-free (cash) return, M(k) is the market return, a(i) is a parameter that specifies the non-systematic return of an asset, b(i) is the asset beta, and V(k,i) is the residual error for each asset with associated random variable V(i). The asset alphas a(1), ... , a(n) are zero in the strict form of the CAPM but are generally nonzero in practice.
The MATLAB dataset CAPMuniverse contains daily total return data from 03-Jan-2000 to 07-Nov-2005 for the following 12 stocks: 'AAPL', 'AMZN', 'CSCO', 'DELL', 'EBAY', 'GOOG', 'HPQ', 'IBM', 'INTC', 'MSFT', 'ORCL', 'YHOO'. Columns 13 and 14 hold the daily returns of the market and the risk-free rate, respectively. To compute the beta of each stock, you will subtract the risk-free rate from the market return to get x and from the stock return to get y. Note that you need to add a column of ones to x to form X, so that X is of size m x 2 and the hypothesis is:
$$h_\theta(x)=\theta^{T}\begin{bmatrix}1\\x\end{bmatrix}=\theta_0+\theta_1x$$
More information about the dataset can be found on the MathWorks website. You will use regression to find the betas of these securities.
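The data preparation described above can be sketched as follows. This is a minimal sketch, assuming that `load CAPMuniverse` (shipped with Financial Toolbox) creates a matrix named `Data` with the 14 columns described above. Dropping rows with missing returns is an assumption of this sketch, not something the text prescribes.

```matlab
load CAPMuniverse                  % assumed to provide Assets, Data, Dates

col = 12;                          % stock column to analyze (12 is YHOO)
rf  = Data(:,14);                  % risk-free rate
x   = Data(:,13) - rf;             % market excess return
y   = Data(:,col) - rf;            % stock excess return

ok = ~isnan(x) & ~isnan(y);        % keep only rows where both returns exist
x  = x(ok);
y  = y(ok);

X = [ones(numel(x),1) x];          % m x 2 design matrix: column of ones, then x
```

Changing `col` to any value from 1 to 12 prepares the data for a different stock.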
The cost function is:
$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum\limits_{i=1}^m \left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$
The function computeCost() is as follows:
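function cost = computeCost(X,y,theta)
% X: m x 2 design matrix, y: m x 1 targets, theta: 2 x 1 parameters
m = length(y);                 % number of samples
h = X*theta;                   % predictions for all samples
cost = 1/2/m*sum((h-y).^2);    % half the mean squared error
end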
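As a quick sanity check of the cost formula, the sketch below evaluates the same expression on three hand-made points. The data and the anonymous function `J` are illustrative assumptions, chosen so that the result can be verified by hand.

```matlab
X = [1 1; 1 2; 1 3];               % toy design matrix (column of ones, then x)
y = [1; 2; 3];                     % toy targets lying exactly on y = x
m = length(y);

J = @(theta) 1/2/m*sum((X*theta - y).^2);   % same expression as in computeCost

cost_zero = J([0; 0]);             % (1 + 4 + 9)/(2*3) = 14/6
cost_fit  = J([0; 1]);             % exact fit, so the cost is zero
fprintf('%.4f %.4f\n', cost_zero, cost_fit);   % prints 2.3333 0.0000
```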
The parameters θ are updated with the pseudo-code:
Repeat until the maximum number of iterations {
Do the following for all thetas:
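$$\theta_j := \theta_j-\frac{\alpha}{m} \sum\limits_{i=1}^m \left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$
where j=0,…,n, n is the number of features (n = 1 here), and $x_0^{(i)}=1$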
}
The cost can be computed within the loop as well.
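The update rule comes from differentiating the cost with respect to each parameter. As a short derivation, using the hypothesis and cost defined above:

$$\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum\limits_{i=1}^m \left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}, \qquad x_0^{(i)}=1$$

Stacking the samples as the rows of X gives the equivalent vectorized form, with $\alpha$ the step size:

$$\nabla J(\theta) = \frac{1}{m}X^{T}(X\theta - y), \qquad \theta := \theta - \alpha\,\nabla J(\theta)$$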
You need to update all the parameters $\theta_j$ simultaneously. The function optimizeCost() has the following output variables, function name, and input parameters:
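function [theta,cost_range] = optimizeCost(X,y,theta,step,maxrun)
% step: learning rate alpha, maxrun: number of iterations
m = length(y);
cost_range = zeros(maxrun,1);                 % cost recorded at each iteration
for iter = 1:maxrun
    h = X*theta;                              % predictions with the current theta
    grad = 1/m * (h-y)' * X;                  % grad is 1 x d
    theta = theta - step * grad';             % simultaneous update of all thetas
    cost_range(iter) = 1/2/m*sum((h-y).^2);   % cost before this update
end
end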
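One way to check the gradient-descent result is to compare it with the closed-form least-squares solution. The sketch below does this on synthetic excess returns. The data, the step size, and the iteration count are illustrative assumptions. The update loop is repeated inline so that the block runs on its own.

```matlab
% Synthetic excess returns (an assumption; they stand in for the real data).
rng(0);                                    % reproducible random numbers
m = 1000;
x = 0.02 * randn(m,1);                     % market excess return
y = 0.0001 + 1.5 * x + 0.02 * randn(m,1);  % stock excess return, true beta 1.5
X = [ones(m,1) x];

theta  = zeros(2,1);
step   = 1;                                % illustrative learning rate
maxrun = 50000;                            % illustrative iteration count
for iter = 1:maxrun
    h = X*theta;
    theta = theta - step/m * (X'*(h-y));   % same update as in optimizeCost
end

theta_ls = X \ y;                          % closed-form least-squares solution
disp([theta theta_ls]);                    % the two columns should nearly agree
```

Because daily returns are small, the curvature of the cost along the beta direction is roughly the variance of x. The intercept therefore settles almost immediately, while beta needs many iterations, or a rescaling of the returns, to converge.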
Plotting the regression line:
For example, column 12 holds the return data for Yahoo (YHOO), and $\theta_0$ and $\theta_1$ are
theta =
0.0001
1.6543
where $\theta_0$ and $\theta_1$ are the alpha and beta values of YHOO computed by regression.
Plotting the cost against the number of iterations shows that the cost does not increase from one iteration to the next, which is the expected behavior when the step size is small enough.
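Both plots can be sketched as follows. The block uses synthetic excess returns and illustrative values for the step size and iteration count, all of which are assumptions. With the real data, x, y, X, theta, and cost_range would come from the earlier steps instead.

```matlab
% Synthetic excess returns (an assumption; replace with the real x and y).
rng(0);
m = 500;
x = 0.02 * randn(m,1);
y = 0.0001 + 1.5 * x + 0.02 * randn(m,1);
X = [ones(m,1) x];

theta = zeros(2,1);
step = 1;
maxrun = 20000;
cost_range = zeros(maxrun,1);
for iter = 1:maxrun
    h = X*theta;
    cost_range(iter) = 1/2/m*sum((h-y).^2);   % cost before the update
    theta = theta - step/m * (X'*(h-y));      % gradient-descent update
end

figure;
subplot(1,2,1);                               % data with the regression line
plot(x, y, '.', x, X*theta, '-');
xlabel('market excess return'); ylabel('stock excess return');
legend('data', 'fitted line');

subplot(1,2,2);                               % cost against iteration number
plot(1:maxrun, cost_range);
xlabel('iteration'); ylabel('cost');

% Numerical check that the cost never rises (up to rounding).
is_nonincreasing = all(diff(cost_range) <= eps);
```

If `is_nonincreasing` comes out false, the step size is probably too large for the data.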

