Machine Learning with Clustering: A Visual Guide for Beginners with Examples in Python
By Artem Kovera
()
About this ebook
Machine Learning Made Easy to Understand with Clustering Algorithms.
Clustering algorithms are commonly used in a variety of applications. There are four major tasks for clustering:
- Making simplification for further data processing. In this case, the data is split into different groups which then are processed individually. In business, for instance, we can find different groups of customers sharing some similar features using cluster analysis. Then, we can use this information to develop different marketing strategies and apply them to all these separate groups of customers. Or, we can cluster a marketplace in a specific niche to find what kinds of products are selling better than other ones to make a decision what kind of products to produce. Usually, clustering is one of the first techniques that help explore a dataset we are going to work with to get some sense of the structure of the data.
- Compression of the data. We can implement cluster analysis on a giant data set. Then from each cluster, we can pick just several items. In this case, we usually lose much less information than in the case where we pick data points without preceding clustering. Clustering algorithms are being used to compress not only large data sets but also relatively small objects like images.
- Picking out unusual data points from the dataset. This procedure is done, for example, for the detection of fraudulent transactions with credit cards. In medicine, similar procedures can be used, for example, to identify new forms of illnesses.
- Building a hierarchy of objects. This is implemented for classification of biological organisms. It is also applied, for example, in search engines to group different text documents inside the search engines’ datasets.
In an introductory chapter, you will find:
- Different types of machine learning;
- Features in datasets;
- Dimensionality of datasets;
- The ‘curse’ of dimensionality;
- Dealing with underfitting and overfitting
In the following chapters, we will implement these concepts in practice, working with clustering algorithms.
This e-book provides detailed explanations of several widely-used clustering approaches with visual representations:
Hierarchical agglomerative clustering;
K-means;
DBSCAN;
Neural network-based clustering
You will learn different strengths and weaknesses of these algorithms as well as the practical strategies to overcome the weaknesses. In addition, we will briefly touch upon some other clustering methods.
This book mostly focuses on how the algorithms work behind the scenes. However, there is some code in this book. The examples of the algorithms are presented in Python 3.
Related to Machine Learning with Clustering
Related ebooks
Practical Machine Learning for Data Analysis Using Python Rating: 0 out of 5 stars0 ratingsMachine Learning: Adaptive Behaviour Through Experience: Thinking Machines Rating: 4 out of 5 stars4/5Deep Learning with Keras: Beginner’s Guide to Deep Learning with Keras Rating: 3 out of 5 stars3/5Python Machine Learning: Introduction to Machine Learning with Python Rating: 0 out of 5 stars0 ratingsMachine Learning Interview Questions Rating: 5 out of 5 stars5/5Applied Predictive Modeling: An Overview of Applied Predictive Modeling Rating: 0 out of 5 stars0 ratingsMachine Learning - A Complete Exploration of Highly Advanced Machine Learning Concepts, Best Practices and Techniques: 4 Rating: 0 out of 5 stars0 ratingsFeature Selection in Machine Learning with Python Rating: 0 out of 5 stars0 ratingsPYTHON FOR DATA ANALYSIS: A Practical Guide to Manipulating, Cleaning, and Analyzing Data Using Python (2023 Beginner Crash Course) Rating: 0 out of 5 stars0 ratingsBuilding Machine Learning Systems with Python Rating: 4 out of 5 stars4/5Designing Machine Learning Systems with Python Rating: 0 out of 5 stars0 ratingsData Science: Concepts and Practice Rating: 3 out of 5 stars3/5Practical Data Analysis Rating: 4 out of 5 stars4/5Julia Cookbook Rating: 0 out of 5 stars0 ratingsPYTHON DATA SCIENCE: A Practical Guide to Mastering Python for Data Science and Artificial Intelligence (2023 Beginner Crash Course) Rating: 0 out of 5 stars0 ratingsMachine Learning Algorithms for Data Scientists: An Overview Rating: 0 out of 5 stars0 ratingsBuilding a Recommendation System with R Rating: 0 out of 5 stars0 ratingsNumPy Cookbook Rating: 5 out of 5 stars5/5Regression Analysis with Python Rating: 0 out of 5 stars0 ratingsConformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications Rating: 0 out of 5 stars0 ratingsHands-on Time Series Analysis with Python: From Basics to Bleeding Edge Techniques Rating: 5 out of 5 stars5/5
Mathematics For You
Algebra - The Very Basics Rating: 5 out of 5 stars5/5Basic Math & Pre-Algebra For Dummies Rating: 4 out of 5 stars4/5Introducing Game Theory: A Graphic Guide Rating: 4 out of 5 stars4/5Mental Math Secrets - How To Be a Human Calculator Rating: 5 out of 5 stars5/5Game Theory: A Simple Introduction Rating: 4 out of 5 stars4/5Algebra I Workbook For Dummies Rating: 3 out of 5 stars3/5Calculus For Dummies Rating: 4 out of 5 stars4/5Geometry For Dummies Rating: 5 out of 5 stars5/5Basic Math Notes Rating: 5 out of 5 stars5/5Quantum Physics for Beginners Rating: 4 out of 5 stars4/5The Everything Guide to Algebra: A Step-by-Step Guide to the Basics of Algebra - in Plain English! Rating: 4 out of 5 stars4/5ACT Math & Science Prep: Includes 500+ Practice Questions Rating: 3 out of 5 stars3/5My Best Mathematical and Logic Puzzles Rating: 5 out of 5 stars5/5The Everything Everyday Math Book: From Tipping to Taxes, All the Real-World, Everyday Math Skills You Need Rating: 5 out of 5 stars5/5The Golden Ratio: The Divine Beauty of Mathematics Rating: 5 out of 5 stars5/5The Elements of Euclid for the Use of Schools and Colleges (Illustrated) Rating: 0 out of 5 stars0 ratingsRelativity: The special and the general theory Rating: 5 out of 5 stars5/5See Ya Later Calculator: Simple Math Tricks You Can Do in Your Head Rating: 4 out of 5 stars4/5The Thirteen Books of the Elements, Vol. 1 Rating: 0 out of 5 stars0 ratingsA Mind for Numbers | Summary Rating: 4 out of 5 stars4/5GED® Math Test Tutor, 2nd Edition Rating: 0 out of 5 stars0 ratingsHow Not To Be Wrong | Summary Rating: 5 out of 5 stars5/5Calculus Made Easy Rating: 4 out of 5 stars4/5The Little Book of Mathematical Principles, Theories & Things Rating: 3 out of 5 stars3/5Is God a Mathematician? Rating: 4 out of 5 stars4/5Algebra I For Dummies Rating: 4 out of 5 stars4/5
Reviews for Machine Learning with Clustering
0 ratings0 reviews
Book preview
Machine Learning with Clustering - Artem Kovera
Introduction to machine learning and clustering
Hierarchical clustering
1.The main idea and advantages/disadvantages of the algorithm
2.Different metrics for computing the distance between clusters
3.Hierarchical agglomerative clustering using the SciPy library
K-means algorithm
1.The major principles of the algorithm
2.Implementing k-means using the Scikit-learn library
3.The disadvantages of k-means and methods to overcome them
4.Introduction to the expectation–maximization (EM) algorithm
DBSCAN
1.The major principles of the algorithm
2.Implementing DBSCAN using the Scikit-learn library
3.Advantages and disadvantages of DBSCAN
Neural network-based clustering
1.General idea of clustering using artificial neural networks
2.Constructing a simple neural net for clustering using Numpy arrays
3.Introduction to self-organizing maps
Introduction to machine learning and clustering
The amount of data in digital format has been growing exponentially in the last decades, and this tendency will certainly continue. Currently, data is a very valuable resource. For example, most companies extensively collect various data, and some companies even sell data to other companies. Apparently, the success of any business in the near future will be largely determined by the efficiency of working with large amounts of data. But the data deluge is relevant not only to business; it is also extremely widespread in many other areas, such as science, education, medicine, state governance, and many others.
The discipline specifically designed to work with all sorts of data has been known for a long time. It is called statistics. However, traditional statistical approaches cannot be successfully applied to large amounts of data that we encounter today without using electronic computational devices. And here comes to the rescue our hero – machine learning. Machine learning can be viewed as a form of applied statistics for solving various optimization problems using computer algorithms. An algorithm is just a set of instructions.
Even more importantly, machine learning gives computer programs another remarkable ability – the ability to adapt to changes in an environment and learn from experience. This is the most important feature of machine learning.
In a traditional programming approach, we explicitly give a computer a set of instructions to execute. In machine learning, we also give instructions (a machine learning algorithm) to a computer, but the algorithm generates a model on the data given to the computer, and then this model can make predictions on new data. That is how the learning from previous experience comes into play. As the algorithm gets more data,