Thursday, October 20, 2016

Loan recommendation system using Collaborative Filtering

You are modeling a recommender system for a lending club platform using model-based Collaborative Filtering (as opposed to the memory-based approach, which computes similarities between users or items directly from the rating data to make recommendations).

Model-based CF is a latent factor model; it is more robust than the memory-based approach and handles sparsity better. Consider a sparse rating matrix Y whose element (i, j) is the rating given by lender j to loan i. The rating matrix is modeled as the matrix product of X (the loan-feature matrix) and the transpose of Θ (the lender-feature matrix); see the figure. Each rating given by lender j to loan i is then the inner product of row i of X and row j of Θ (i.e., column j of Θᵀ), as illustrated below:

[Figure: the rating matrix Y ≈ X·Θᵀ, where each rating y(i,j) is the inner product of loan i's feature row in X and lender j's preference row in Θ.]
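To make the factorization concrete, here is a minimal toy sketch (the dimensions and values are made up for illustration; they are not from the loan data):

% Toy example: 3 loans, 2 lenders, 2 latent features (illustrative values only)
X = [1 0; 0 1; 1 1];           % loan-feature matrix, n_loans x n_features
Theta = [5 1; 2 4];            % lender-feature matrix, n_lenders x n_features
Y_hat = X*Theta';              % predicted rating matrix, n_loans x n_lenders
rating_31 = X(3,:)*Theta(1,:)' % rating of loan 3 by lender 1: 1*5 + 1*1 = 6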
The recommender system model helps estimate the missing ratings and recommends good loans to the lenders. The function costFunction() computes the regularized cost function

$$J = \frac{1}{2}\sum_{(i,j):\,r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2}\sum_{i}\sum_{k}\left(x_k^{(i)}\right)^2 + \frac{\lambda}{2}\sum_{j}\sum_{k}\left(\theta_k^{(j)}\right)^2$$

where x^{(i)} is row i of X and θ^{(j)} is row j of Θ, and the last two terms are the regularization terms. The gradients of the cost function are

$$\frac{\partial J}{\partial x_k^{(i)}} = \sum_{j:\,r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)\theta_k^{(j)} + \lambda x_k^{(i)}$$

$$\frac{\partial J}{\partial \theta_k^{(j)}} = \sum_{i:\,r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)x_k^{(i)} + \lambda \theta_k^{(j)}$$

In the code of the costFunction function, X and Θ are extracted from the parameter vector param. X has the loan features on its columns (one row per loan), and Θ has the lenders' preference weights on its rows (one row per lender, one feature per column).


function [J,grad] = costFunction(param,Y,r,n_lenders,n_loans,n_features,lambda)
% Extract X and Theta from param vector
X = reshape(param(1:n_loans*n_features),n_loans,n_features);
Theta = reshape(param(n_loans*n_features+1:end),n_lenders,n_features);
            
% Cost
predictions = X*Theta'; % predicted ratings, n_loans x n_lenders
errors = (predictions-Y).*r; % keep only the rated entries, n_loans x n_lenders
J = (1/2)*sum(sum(errors.^2));

% Gradients
X_grad = errors*Theta; % (n_loans x n_lenders)*(n_lenders x n_features) = n_loans x n_features
Theta_grad = errors'*X; % (n_lenders x n_loans)*(n_loans x n_features) = n_lenders x n_features

% Regularization terms to penalize overfitting
reg_X = (lambda/2)*sum(sum(X.^2));
reg_Theta = (lambda/2)*sum(sum(Theta.^2));
J = J+reg_Theta+reg_X;

% Add regularization terms to gradients
X_grad = X_grad+lambda*X;
Theta_grad = Theta_grad+lambda*Theta;

grad = [X_grad(:); Theta_grad(:)];
end
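
Because a transposition or sign mistake in the analytic gradient is easy to make, it helps to check costFunction against a finite-difference gradient on a tiny random problem before training. This is a minimal sketch; the dimensions, sparsity pattern, and step h are arbitrary test choices:

% Compare the analytic gradient with a central-difference approximation
n_loans = 4; n_lenders = 3; n_features = 2; lambda = 1;
Y = rand(n_loans,n_lenders); r = Y > 0.5; Y = Y.*r; % random ratings + mask
param = randn((n_loans+n_lenders)*n_features,1);
[~,grad] = costFunction(param,Y,r,n_lenders,n_loans,n_features,lambda);
numgrad = zeros(size(param));
h = 1e-4;
for k = 1:numel(param)
    e = zeros(size(param)); e(k) = h;
    numgrad(k) = (costFunction(param+e,Y,r,n_lenders,n_loans,n_features,lambda) ...
                - costFunction(param-e,Y,r,n_lenders,n_loans,n_features,lambda))/(2*h);
end
fprintf('Max |analytic - numerical| = %g\n', max(abs(grad-numgrad)));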

The optimizeCost function searches for the optimal parameter vector using batch gradient descent with a fixed learning rate (step):

function [param,cost_range] = optimizeCost(param,Y,r,n_lenders,n_loans,n_features,lambda,step,maxrun)
    cost_range = zeros(maxrun,1);
    for iter = 1:maxrun
        [J,grad] = costFunction(param,Y,r,n_lenders,n_loans,n_features,lambda);
        param = param - step * grad; % gradient descent update for both X and Theta
        cost_range(iter) = J; % record cost to monitor convergence
    end
end
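
Since costFunction returns both the cost and its gradient, the same objective can alternatively be handed to a library optimizer that chooses the step size itself. A sketch assuming MATLAB's Optimization Toolbox (fminunc) and the variables defined in the main script below:

% Alternative to the fixed-step loop: quasi-Newton optimization with fminunc
options = optimset('GradObj','on','MaxIter',200);
f = @(p) costFunction(p,Y,R,n_lenders,n_loans,n_features,lambda);
param = fminunc(f,init_param,options);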


The data contain 10 lenders and a large number of loans; many of the loans have been rated by the lenders on a scale of 1 to 10 (in practice a platform would have many more lenders investing). Loans that have not been rated yet carry a rating of 0.


The main script:

clear all; close all; clc;
[Y,txt,raw1] = xlsread('loandata.xlsx','loanrating','A1:J101'); % numeric rating matrix Y
[num,txt2,raw2] = xlsread('loandata.xlsx','loan','A1:J101'); % loan details (not used below)
R = (Y~=0); % indicator matrix: 1 where a loan has been rated

n_lenders = size(Y,2);
n_loans = size(Y,1);
n_features = 10;

% Initialization
X = randn(n_loans,n_features);
Theta = randn(n_lenders,n_features);
init_param = [X(:); Theta(:)];

% Optimization
lambda = 10;
maxrun = 1000; % maximum number of iterations
step = 0.001; 
[param,cost_range] = optimizeCost(init_param,Y,R,n_lenders,n_loans,n_features,lambda,step,maxrun);

% Extract X and Theta from the optimized parameter vector param
X = reshape(param(1:n_loans*n_features),n_loans,n_features);
Theta = reshape(param(n_loans*n_features+1:end),n_lenders,n_features);
pred = X * Theta';
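
The predicted ratings can then be turned into actual recommendations, for example by ranking each lender's unrated loans. A minimal sketch follows; the cutoff of 5 loans and the exclusion of already-rated loans are assumptions, not part of the original script:

% Recommend the top 5 unrated loans for each lender
n_top = 5;
for j = 1:n_lenders
    scores = pred(:,j);
    scores(R(:,j)) = -Inf; % exclude loans this lender has already rated
    [~,idx] = sort(scores,'descend');
    fprintf('Lender %d: recommended loans %s\n', j, mat2str(idx(1:n_top)'));
end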