
Assignment No B-07

Aim
Develop a book recommender Expert system.

Pre-requisite
1. Machine Learning & Data Mining.
2. Collaborative Filtering.
3. Programming language basics.

Objective
1. To understand the idea of a recommender system.
2. To develop a book recommender expert system.

Problem Statement
Develop a book recommender Expert system.

Hardware / Software Used
1. Python

Mathematical Model
M = { s, e, X, Y, Fme, DD, NDD, Success, Failure, CPU CoreCount }

1. s = Initial state - Open the dataset files and read their contents.
2. e = End state - Book recommended.
3. X = Input - User ID and dataset files.
4. Y = Output - Book recommendation for the given user ID, based on a search of the user's profile.
5. Fme = Calculate which book to recommend using a distance formula (Euclidean or Manhattan).
6. DD = User ID, dataset.
7. NDD = Book rating based on user search history.
8. Success = Book recommended.
9. Failure = User ID out of range.
10. CPU CoreCount = 1
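
As a minimal sketch of the Fme step, the two distance formulas can be written over the books that both users have rated. The function names and the sample ratings below are illustrative assumptions and are not taken from the dataset.

```python
import math


def manhattan_distance(ratings_a, ratings_b):
    """Sum of absolute rating differences over books rated by both users."""
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        # No shared books: treat the two users as infinitely far apart.
        return float("inf")
    return sum(abs(ratings_a[b] - ratings_b[b]) for b in common)


def euclidean_distance(ratings_a, ratings_b):
    """Square root of the summed squared differences over shared books."""
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return float("inf")
    return math.sqrt(sum((ratings_a[b] - ratings_b[b]) ** 2 for b in common))


# Hypothetical ratings: book id -> rating
alice = {101: 8, 102: 5, 103: 9}
bob = {101: 5, 102: 9, 104: 7}

print(manhattan_distance(alice, bob))  # |8-5| + |5-9| = 7
print(euclidean_distance(alice, bob))  # sqrt(9 + 16) = 5.0
```

A smaller distance means the two users rate their shared books more alike, so the user with the smallest distance is treated as the nearest neighbour.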

Theory
Recommender System:
Recommender Systems (RSs) are software tools and techniques that suggest items likely to be
of use to a user. The suggestions relate to various decision-making processes, such as what
items to buy, what music to listen to, or what online news to read.
"Item" is the general term used to denote what the system recommends to users. An RS
normally focuses on a specific type of item (e.g., CDs or news), and accordingly its design, its
graphical user interface, and the core recommendation technique used to generate the
recommendations are all customized to provide useful and effective suggestions for that
specific type of item.
To implement its core function, identifying the items that are useful to the user, an RS
must predict that an item is worth recommending. To do this, the system must be able to
predict the utility of some items, or at least compare the utility of some items, and then
decide which items to recommend based on this comparison. The prediction step may not be
explicit in the recommendation algorithm, but this unifying model still describes the general
role of an RS.
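
The predict-then-compare idea can be illustrated with a short sketch. It assumes that utility is estimated as a similarity-weighted average of neighbours' ratings, which is one common choice and not necessarily what the program in this assignment does. The names predict_utility and recommend, and the sample data, are hypothetical.

```python
def predict_utility(item, neighbours):
    """Estimate an item's utility as the similarity-weighted mean rating.

    neighbours is a list of (similarity, ratings_dict) pairs.
    Returns None when no neighbour has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for similarity, ratings in neighbours:
        if item in ratings:
            weighted_sum += similarity * ratings[item]
            weight_total += similarity
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


def recommend(candidates, neighbours, top_n=2):
    """Compare the predicted utilities and keep the top_n items."""
    scored = []
    for item in candidates:
        utility = predict_utility(item, neighbours)
        if utility is not None:
            scored.append((item, utility))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in scored[:top_n]]


# Hypothetical neighbours: (similarity to the target user, their ratings)
neighbours = [
    (0.9, {"book_a": 9, "book_b": 4}),
    (0.1, {"book_a": 5, "book_c": 10}),
]
print(recommend(["book_a", "book_b", "book_c"], neighbours))
```

Here book_c is ranked first because its only rating is a 10, followed by book_a, whose predicted utility is pulled towards the rating given by the more similar neighbour.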

Types of recommendation systems:
1. Personal recommender systems
2. Collaborative recommender systems
3. Content-based recommender systems
4. Knowledge-based recommender systems
5. Hybrid recommender systems
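
Of these types, the collaborative approach is the closest match to the nearest-neighbour method used in this assignment. The sketch below shows the idea in its simplest form: find the user whose ratings are closest to the target user, then suggest the books that neighbour has rated and the target has not. The function names and the tiny data set are assumptions made for illustration.

```python
def nearest_neighbour(target, data_set):
    """Return the id of the user with the smallest Manhattan distance to target."""
    best_user, best_dist = None, float("inf")
    target_ratings = data_set[target]
    for user, ratings in data_set.items():
        if user == target:
            continue
        common = ratings.keys() & target_ratings.keys()
        if not common:
            continue  # nothing to compare, skip this user
        dist = sum(abs(ratings[b] - target_ratings[b]) for b in common)
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user


def collaborative_recommend(target, data_set):
    """Suggest books the nearest neighbour rated that the target has not."""
    neighbour = nearest_neighbour(target, data_set)
    if neighbour is None:
        return []
    return sorted(b for b in data_set[neighbour] if b not in data_set[target])


# Hypothetical data set: user id -> {book id: rating}
data_set = {
    1: {"b1": 8, "b2": 6},
    2: {"b1": 7, "b2": 6, "b3": 9},
    3: {"b1": 2, "b4": 10},
}
print(collaborative_recommend(1, data_set))  # user 2 is nearest, so ['b3']
```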

Procedure
python book_recommender.py

Conclusion
Hence, we have successfully implemented a book recommendation system.

Program
=================================================
GROUP A
Assignment No : B7
Title : Develop a book recommender Expert system.
Roll No :
Batch : B
Class : BE ( Computer )
=================================================

import csv
import sys
import math
from enum import Enum


class DistanceFormula(Enum):
    manhatten_distance = 1
    euclidean_distance = 2
    minkowski_distance = 3
    pearson_approx = 4


class BookRecommender:
    __user_data_base = None
    __book_data_base = None
    __rating_data_base = None
    __data_set = None
    __user_id = None
    __metric = None
    __n = None
    __k = None
    __r = None

    def __compute_manhatten_distance(self, rating_1, rating_2):
        # Sum of absolute differences over the books rated by both users.
        distance = 0.0
        match_found = False
        for key in rating_1:
            if key in rating_2:
                match_found = True
                distance += abs(rating_1[key] - rating_2[key])
        if match_found:
            return distance
        else:
            return float("inf")

    def __compute_euclidean_distance(self, rating_1, rating_2):
        # Square root of the summed squared differences over shared books.
        distance = 0.0
        match_found = False
        for key in rating_1:
            if key in rating_2:
                match_found = True
                distance += pow(rating_1[key] - rating_2[key], 2)
        if match_found:
            return math.sqrt(distance)
        else:
            return float("inf")

    def __compute_minkowski_distance(self, rating_1, rating_2):
        # Generalised distance of order r (r = 1 Manhattan, r = 2 Euclidean).
        r = self.__r
        distance = 0.0
        common_ratings = False
        for key in rating_1:
            if key in rating_2:
                distance += pow(abs(rating_1[key] - rating_2[key]), r)
                common_ratings = True
        if common_ratings:
            return pow(distance, 1 / r)
        else:
            # No books in common: report an infinite distance, as the other
            # distance measures do, so unrelated users never rank as nearest.
            return float("inf")

    def __compute_pearson_correlation_approx(self, rating_1, rating_2):
        # Single-pass approximation of the Pearson correlation coefficient
        # over the books rated by both users.
        sum_x = 0.0
        sum_y = 0.0
        sum_xy = 0.0
        sum_x2 = 0.0
        sum_y2 = 0.0
        n = 0
        for key in rating_1:
            if key in rating_2:
                x = rating_1[key]
                y = rating_2[key]
                sum_x += x
                sum_y += y
                sum_xy += (x * y)
                sum_x2 += (x * x)
                sum_y2 += (y * y)
                n += 1
        if n == 0:
            return 0
        else:
            den = math.sqrt(sum_x2 - ((sum_x**2) / n)) * \
                math.sqrt(sum_y2 - ((sum_y**2) / n))
            if den == 0:
                return 0
            else:
                num = sum_xy - ((sum_x * sum_y) / n)
                return num / den

    def __compute_nearest_neighbor_list(self, func_dist_measure):
        # Score every other user against the target user.
        user_rating = self.__data_set[self.__user_id]
        nearest_neighbor_list = []
        for key in self.__data_set:
            if key != self.__user_id:
                key_rating = self.__data_set[key]
                dist = func_dist_measure(user_rating, key_rating)
                nearest_neighbor_list.append((key, dist))
        return nearest_neighbor_list

    def __get_nearest_neighbor_list(self):
        if self.__metric == DistanceFormula.manhatten_distance:
            return self.__compute_nearest_neighbor_list(
                self.__compute_manhatten_distance)
        elif self.__metric == DistanceFormula.euclidean_distance:
            return self.__compute_nearest_neighbor_list(
                self.__compute_euclidean_distance)
        elif self.__metric == DistanceFormula.minkowski_distance:
            return self.__compute_nearest_neighbor_list(
                self.__compute_minkowski_distance)
        elif self.__metric == DistanceFormula.pearson_approx:
            # The method is defined as __compute_pearson_correlation_approx.
            return self.__compute_nearest_neighbor_list(
                self.__compute_pearson_correlation_approx)
        else:
            print("invalid similarity measure type")
            sys.exit()

    def __sort_nearest_neighbor_list(self, nearest_neighbor_list):
        # Distances sort ascending (smaller is closer); the Pearson
        # correlation sorts descending (larger is more similar).
        if self.__metric == DistanceFormula.manhatten_distance or \
                self.__metric == DistanceFormula.euclidean_distance or \
                self.__metric == DistanceFormula.minkowski_distance:
            return sorted(nearest_neighbor_list, key=lambda arg: arg[1])
        elif self.__metric == DistanceFormula.pearson_approx:
            return sorted(nearest_neighbor_list, key=lambda arg: arg[1],
                          reverse=True)

    def __get_nearest_ratings(self, nearest_neighbor_list):
        # The first entry of the sorted list is the closest user.
        nearest_user = nearest_neighbor_list[0][0]
        nearest_ratings = self.__data_set[nearest_user]
        return nearest_ratings

    def __get_user_id_ratings(self):
        user_id_ratings = self.__data_set[self.__user_id]
        return user_id_ratings

    def __get_recommendation_list(self, user_id_ratings, nearest_ratings):
        # Recommend books the nearest user rated that the target user has not.
        recommendations_list = []
        for key in nearest_ratings:
            if key not in user_id_ratings:
                for book in self.__book_data_base:
                    if int(book[0]) == key:
                        recommendations_list.append(book)
        return recommendations_list

    def __get_rating_dictionary(self, user):
        # Map book id -> rating for one user.
        rating_dictionary = {}
        for rating in self.__rating_data_base:
            if int(user) == int(rating[0]):
                rating_dictionary[int(rating[1])] = int(rating[2])
        return rating_dictionary

    def __construct_data_set(self):
        # Map user id -> {book id: rating}.
        data_set = {}
        for user in self.__user_data_base:
            rating_dict = self.__get_rating_dictionary(user[0])
            data_set[int(user[0])] = rating_dict
        return data_set

    def recommend(self):
        print("...construting data set")
        self.__data_set = self.__construct_data_set()
        print("...data set construction completed")
        print("...computing nearest neighbor list")
        nearest_neighbor_list = self.__get_nearest_neighbor_list()
        print("...completed computing of nearest neighbor list")
        print("...computing recommended list of books")
        nearest_neighbor_list = self.__sort_nearest_neighbor_list(
            nearest_neighbor_list)
        nearest_ratings = self.__get_nearest_ratings(nearest_neighbor_list)
        user_id_ratings = self.__get_user_id_ratings()
        recommendations_list = self.__get_recommendation_list(
            user_id_ratings, nearest_ratings)
        print("...completed computing of recommended list of books")
        return recommendations_list

    def __init__(self, user_data_base, book_data_base, rating_data_base,
                 user_id, metric, n, k, r):
        self.__user_data_base = user_data_base
        self.__book_data_base = book_data_base
        self.__rating_data_base = rating_data_base
        self.__user_id = int(user_id)
        self.__metric = metric
        self.__n = n
        self.__k = k
        self.__r = r


def get_few_essential_parameters():
    metric = DistanceFormula.manhatten_distance
    n = 5
    k = 3
    r = 2
    return metric, n, k, r


def get_user_id(user_count):
    print("\n")
    print("Please enter the user-id to whom books need to be recommended.")
    print("The valid range for user-id: [1," + str(user_count) + "]")
    print("\n")
    user_id = input()
    return user_id


def print_recommended_books(user_id, recommendations_list):
    sys.stdout.flush()
    if len(recommendations_list) == 0:
        print("\n No suitable books to recommend for user_id:", user_id)
    else:
        print("\n Following books are recommended for user_id:", user_id, "\n")
        for book in recommendations_list:
            print("[" + str(book[0]) + ", " + "\"" + str(book[1]) + "\"" + ", " +
                  "\"" + str(book[2]) + "\"" + ", " + str(book[3]) + ", " + "\"" +
                  str(book[4]) + "\"" + "]")
        print("\n")


def read_csv_files(user_file, book_file, rating_file):
    # The dataset files are semicolon-separated.
    user_data_base = list(csv.reader(user_file, delimiter=";"))
    book_data_base = list(csv.reader(book_file, delimiter=";"))
    rating_data_base = list(csv.reader(rating_file, delimiter=";"))
    return user_data_base, book_data_base, rating_data_base


def open_csv_files():
    user_file = open("BX-Users.csv", "r")
    book_file = open("BX-Books.csv", "r")
    rating_file = open("BX-Book-Ratings.csv", "r")
    return user_file, book_file, rating_file


def close_csv_files(user_file, book_file, rating_file):
    user_file.close()
    book_file.close()
    rating_file.close()


def clear_screen():
    # Scroll the previous output out of view by printing blank lines.
    print("\n" * 66)


def simple_book_recommendation_system():
    user_file, book_file, rating_file = open_csv_files()
    user_data_base, book_data_base, rating_data_base = \
        read_csv_files(user_file, book_file, rating_file)
    clear_screen()
    user_id = get_user_id(len(user_data_base))
    metric, n, k, r = get_few_essential_parameters()
    book_recommender = BookRecommender(user_data_base, book_data_base,
                                       rating_data_base, user_id, metric, n, k, r)
    recommendations_list = book_recommender.recommend()
    clear_screen()
    print_recommended_books(user_id, recommendations_list)
    close_csv_files(user_file, book_file, rating_file)


simple_book_recommendation_system()
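
In the listing, __construct_data_set calls __get_rating_dictionary once per user, so every rating row is rescanned for each user. One possible alternative, sketched here as a standalone function, groups the ratings in a single pass. The name build_data_set and the sample rows are assumptions and are not part of the original program.

```python
def build_data_set(user_rows, rating_rows):
    """Build {user id: {book id: rating}} with one pass over the ratings."""
    # Start every known user with an empty rating dictionary, so users
    # without ratings still appear in the data set.
    data_set = {int(row[0]): {} for row in user_rows}
    for row in rating_rows:
        user_id, book_id, rating = int(row[0]), int(row[1]), int(row[2])
        # Ratings from users missing in the user file are ignored.
        if user_id in data_set:
            data_set[user_id][book_id] = rating
    return data_set


# Hypothetical rows in the same column order the listing reads:
# users: [user id, ...], ratings: [user id, book id, rating]
user_rows = [["1", "pune", "21"], ["2", "mumbai", "30"], ["3", "nashik", "25"]]
rating_rows = [["1", "32", "8"], ["2", "32", "5"], ["2", "40", "7"], ["9", "40", "3"]]

print(build_data_set(user_rows, rating_rows))
```

The result has the same shape as the dictionary the listing builds, so it could in principle be dropped in wherever that data set is used.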

Output
administrator@siftworkstation:~$ cd Desktop/
administrator@siftworkstation:~/Desktop$ python book_recommender.py
Please enter the user-id to whom books need to be recommended.
The valid range for user-id: [1,50]
20
...construting data set
...data set construction completed
...computing nearest neighbor list
...completed computing of nearest neighbor list
...computing recommended list of books
...completed computing of recommended list of books
(\nFollowing books are recommended for user_id:, 20, \n)
[0000000032, "Wie Barney es sieht.", "Mordecai Richler", 2002, "L??bbe"]

administrator@siftworkstation:~/Desktop$


Plagiarism Score
