Thursday, October 20, 2016

Stock price clustering analysis

The dataset contains daily closing prices for S&P 500 stocks over a short period. A few stocks and prices are shown below:


The stock symbols appear in the column headers. The companies operate in 10 sectors:

Health Care
Financials
Information Technology
Industrials
Utilities
Materials
Consumer Staples
Consumer Discretionary
Energy
Telecommunication Services

In the pre-processing step, a new data set is created that records, for each stock and each day, whether the closing price rose compared with the previous day (1 for UP, 0 for DOWN or unchanged). The matrix is then transposed so that each row holds the up/down movements of one stock. The model groups these rows (points) into clusters. The number of clusters is set to 10 to see whether stocks of companies in the same sector tend to be grouped together.
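The pre-processing step can be sketched on a tiny price matrix. The prices below are made up for illustration and are not taken from the dataset; diff is used here as an equivalent shorthand for subtracting each day's prices from the next day's.

```matlab
% Hypothetical closing prices: 4 days (rows) x 3 stocks (columns)
prices = [10 20 30;
          11 19 30;
          12 18 31;
          11 19 32];

% 1 where the price rose versus the previous day, 0 otherwise
% (an unchanged price is therefore counted as DOWN)
up = double(diff(prices, 1, 1) > 0);

% Transpose so that each row holds one stock's up/down history
movement = up';
disp(movement)
% movement = [1 1 0; 0 0 1; 0 1 1]
```

Four days of prices give three movements per stock, so the 3 stocks become 3 points in a 3-dimensional space of 0/1 values.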
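The km function implements the K-means algorithm. The outer loop runs for at most max_iters iterations and stops early once no point changes cluster. The first inner loop assigns each example (point) to the cluster with the nearest centroid. The second inner loop recomputes each centroid as the mean of the points assigned to it.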

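function [idx,centroids] = km(X,K,max_iters)
% KM  Basic K-means clustering.
%   X: m-by-n data matrix (one point per row), K: number of clusters,
%   max_iters: maximum number of iterations.
[m, n] = size(X);
centroids = rand(K,n);   % random initial centroids in [0,1]^n
idx = zeros(m,1);        % cluster index of each point
old_idx = zeros(m,1);    % assignments from the previous iteration

for j=1:max_iters
    % Assignment step: nearest centroid by squared Euclidean distance
    for i=1:m
        [~,idx(i)] = min(sum((repmat(X(i,:),K,1)-centroids).^2,2));
    end
    change = any(idx ~= old_idx);

    % Update step: move each centroid to the mean of its points
    for i=1:K
        if any(idx==i)   % keep an empty cluster's centroid (avoids NaN)
            centroids(i,:) = mean(X(idx==i,:),1);
        end
    end

    if ~change   % converged: no assignment changed
        break;
    end

    old_idx = idx;
end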
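To make the two inner loops concrete, here is a minimal sketch that traces a single iteration by hand. The four points and the two starting centroids are made up for illustration; km itself starts from random centroids.

```matlab
% Hypothetical toy data: 4 points in 2-D and K = 2 starting centroids
X = [0 0; 0 1; 1 0; 1 1];
centroids = [0 0.4; 1 0.6];
K = size(centroids, 1);
m = size(X, 1);
idx = zeros(m, 1);

% Assignment step: nearest centroid by squared Euclidean distance
for i = 1:m
    d = sum((repmat(X(i,:), K, 1) - centroids).^2, 2);
    [~, idx(i)] = min(d);
end

% Update step: each centroid becomes the mean of its assigned points
for k = 1:K
    centroids(k,:) = mean(X(idx == k, :), 1);
end

disp(idx')        % 1 1 2 2
disp(centroids)   % [0 0.5; 1 0.5]
```

The first two points are closer to the first centroid and the last two to the second, so each centroid moves to the midpoint of its pair.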

And the main script:

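clear all; close all; clc;
% data: numeric closing prices (one column per stock);
% symbols: text cells read from the sheet (the ticker symbols in the header row)
[data,symbols,raw] = xlsread('sp500_short_period.xlsx','sp500');
% 1 if the price rose versus the previous day, 0 otherwise;
% transposed so that each row is one stock
movement = double((data(2:end,:)-data(1:end-1,:))>0)';

K = 10; % 10 sectors
max_iters = 100;

[idx,centroids] = km(movement,K,max_iters);

% List the symbols assigned to each cluster
for i=1:K
    fprintf('\nStocks in group %d moving up together\n',i);
    disp(char(symbols(idx == i)));
end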

An example, cluster 6: all of the stocks are in the Financials sector (specifically, they are real estate investment trusts):
AIV
AVB
BXP
EQR
HCP
HCN
KIM
PCL
PLD
PSA
SPG
VTR
VNO
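One way to check how well clusters line up with sectors, beyond reading the lists by eye, is to cross-tabulate the two. The sketch below is only an illustration: it assumes a sector index is available for every stock, and the sector vector and the small idx vector are hypothetical values, not data from this post.

```matlab
% Hypothetical inputs: cluster index per stock (as returned by km) and a
% sector index per stock (1..S); real sector labels are not in the post
idx    = [1 1 2 2 2 3 3 1]';
sector = [1 1 2 2 3 3 3 1]';
K = max(idx);
S = max(sector);

% counts(k, s) = number of stocks in cluster k that belong to sector s
counts = accumarray([idx sector], 1, [K S]);

% Purity: fraction of stocks that fall in their cluster's majority sector
purity = sum(max(counts, [], 2)) / numel(idx);

disp(counts)                          % [3 0 0; 0 2 1; 0 0 2]
fprintf('purity = %.3f\n', purity);   % purity = 0.875
```

A row of counts with a single non-zero entry corresponds to a cluster like the one above, where every stock comes from the same sector.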