
K-Nearest Neighbors

Day 4
Introduction
At its core, the algorithm says:

• Pick the number of neighbors (K) you want to use for classification or regression
• Choose a method to measure distances
• Keep a data set of labeled records
• For every new point, identify the K nearest neighbors using the distance measure you chose
• Let them vote if it is a classification problem, or take a mean/median for regression (a minimal sketch of these steps follows below)
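As a minimal sketch of these steps, assuming plain Euclidean distance and an in-memory NumPy array (the function name and toy data are illustrative, not from the original notes):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3, task="classification"):
    # Measure Euclidean distance from x_new to every stored record
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Pick the indices of the k closest training points
    nearest = np.argsort(dists)[:k]
    if task == "classification":
        # Let the neighbors vote
        return Counter(y_train[nearest]).most_common(1)[0][0]
    # ...or take the mean for regression
    return float(np.mean(y_train[nearest]))

# Toy usage: two points of class 0, two of class 1
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.95, 0.9]), k=3))   # prints 1

Note that there is no training step beyond storing X and y; all the work happens at prediction time.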
Diagrammatically
• Here, K = 1. The new green point is labeled black, as its single nearest neighbor is black.
• Here, K = 3. The new green point is labeled white based on the vote of its three nearest neighbors.
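The same flip between K = 1 and K = 3 can be reproduced with scikit-learn's KNeighborsClassifier; the 2-D points below are made up for illustration and are not the figure's data:

from sklearn.neighbors import KNeighborsClassifier

# Illustrative 2-D points: 0 = white, 1 = black
X = [[0.0, 0.0], [0.3, 0.2], [0.4, 0.0], [1.0, 1.0]]
y = [0, 0, 0, 1]
new_point = [[0.8, 0.8]]

for k in (1, 3):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, clf.predict(new_point))
# k=1 -> [1] (the single closest point is black)
# k=3 -> [0] (two of the three neighbors are white)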
From the algorithm, clearly KNN is

• Lazy: This is a technical term! All the techniques we learned so far have a phase called the "training phase" and try to identify a function from the training set, then apply this function to the test data. Such learning is called "eager learning". K-NN, on the other hand, does not generalize and uses all the training data (or a subset) in the testing phase. This type of learning is called lazy learning or instance-based learning.
• K-NN requires more prediction time, as all data points must be examined to make each decision.
• It requires more memory, as all the training data needs to be stored. Each query costs roughly O(N·d) distance computations, where N is the number of training examples and d is the dimension of each sample, so it is very expensive for large data sets and large dimensions.
• Hence, a lot of effort must be spent in reducing N and d. K-NN does suffer from the curse of dimensionality. (A short sketch of this per-query cost follows below.)
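A rough sketch of that per-query cost, assuming a brute-force NumPy scan; the sizes N and d are arbitrary choices for illustration:

import numpy as np

# Hypothetical sizes, chosen only to make the O(N*d) per-query cost visible
N, d = 100_000, 50
X_train = np.random.rand(N, d)           # "training" is just storing this array
y_train = np.random.randint(0, 2, N)

x_query = np.random.rand(d)
# Every single prediction computes a distance to all N stored points
dists = np.linalg.norm(X_train - x_query, axis=1)   # ~N*d arithmetic operations
k_nearest = np.argsort(dists)[:5]
prediction = int(y_train[k_nearest].sum() > 2)      # majority vote among 5 neighbors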
Attributes
Handling curse of dimensionality
• K-NN is heavily impacted by a huge number of dimensions
• Reduce the dimensions using
– Correlation, Principal Component Analysis
– Gain Ratio, Information Gain (filter approaches: we may lose some features that are important)
– Wrapper methods (forward selection, backward elimination)
– Weighting the attributes
• Scaling the attributes
– Attributes with a larger range can dominate the distance. To understand this, consider the pair of data points (0.1, 20) and (0.9, 720): the distance is almost completely dominated by (720 - 20) = 700. To avoid this, we standardize attributes to force them onto a common value range. The common techniques include
– Taking logarithms when one variable varies over several orders of magnitude
– Dividing by the highest value to get the variables between 0 and 1
– Standardizing, which brings most of the data to between -3 and 3 (see the preprocessing pipeline sketch after this list)
• Categorical and ordinal variables need to be converted to numeric values
• Handling missing values
– K-NN is impacted heavily by missing values
– Imputation is one option
• Handling overfitting
– Remove outliers (Wilson Editing)
• Speeding up KNN
– Condensation
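One possible way to combine scaling and dimensionality reduction before kNN is a scikit-learn Pipeline; the dataset and the choices of n_components and n_neighbors below are illustrative only:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Scale first so no attribute dominates the distance, then reduce d with PCA
knn = Pipeline([
    ("scale", StandardScaler()),           # standardize each attribute
    ("pca", PCA(n_components=5)),          # reduce d (5 is an arbitrary choice)
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
knn.fit(X_tr, y_tr)
print(knn.score(X_te, y_te))

Keeping the scaler and PCA inside the pipeline ensures they are fitted only on the training split and then applied unchanged to the test split.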
Feature Engineering
• kNN produces complex decision surfaces.
• As the complexity of the decision surface increases, generalization accuracy tends to decrease and we need more data.
• Increase K to decrease over-fitting.
• kNN gives no explicability (no interpretable model is produced).
• kNN is a distance-based method, so it handles only numeric variables; convert categorical/ordinal values into numeric ones.
• kNN works well in batch mode, not in real time.
• kNN fails when there are missing values (use kNN imputation in data pre-processing to fill them, as sketched below).
• In kNN, training is easy but predictions are expensive.
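A minimal sketch of kNN imputation using scikit-learn's KNNImputer; the toy matrix and the n_neighbors value are illustrative:

import numpy as np
from sklearn.impute import KNNImputer

# Tiny illustrative matrix with one missing value (np.nan)
X = np.array([[1.0, 2.0], [1.1, np.nan], [5.0, 6.0], [5.2, 6.1]])

# Each missing entry is filled from the mean of that column in the
# n_neighbors rows closest to the incomplete row
imputer = KNNImputer(n_neighbors=2)
print(imputer.fit_transform(X))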
