Exploratory and Multivariate Data Analysis

Ebook, 743 pages, 4 hours

Name: Exploratory and Multivariate Data Analysis
Author: Michel Jambu
ISBN: 9780080923673

About this ebook

Opening with a useful index of notations, this book explains and illustrates the theory and application of data analysis methods, from univariate to multidimensional, and shows how to learn and use them efficiently. Well illustrated and well documented, it reviews the most important data analysis techniques.

Describes in detail exploratory data analysis techniques, from univariate to multivariate
Features a complete description of correspondence analysis and factor analysis as multidimensional statistical data analysis techniques, illustrated with concrete, understandable examples
Includes an up-to-date description of clustering algorithms and their many properties, which gives clustering a new role among data analysis techniques

Mathematics

Language: English
Publisher: Academic Press
Release date: Sep 9, 1991

13 min read
How Clever Tech Is Changing The Game
Finweek - English
Article
How Clever Tech Is Changing The Game
Oct 18, 2019
3 min read
Nerd’s Notes: How We Did The ClinicalTrials.gov Data Analysis
STAT
Article
Nerd’s Notes: How We Did The ClinicalTrials.gov Data Analysis
Mar 30, 2018
The principles of transparency and replication are as important to us as data journalists as they are to researchers.
5 min read
Is Artificial Intelligence Permanently Inscrutable?
Nautilus
Article
Is Artificial Intelligence Permanently Inscrutable?
Sep 1, 2016
Dmitry Malioutov can’t say much about what he built. As a research scientist at IBM, Malioutov spends part of his time building machine learning systems that solve difficult problems faced by IBM’s corporate clients. One such program was meant for a
13 min read
Small Data
PC Pro Magazine
Article
Small Data
Oct 8, 2022
3 min read
Life Science
Family Tree
Article
Life Science
Jun 27, 2023
6 min read
Top Scientists Revamp Standards To Foster Integrity In Research
NPR
Article
Top Scientists Revamp Standards To Foster Integrity In Research
Apr 11, 2017
3 min read
Putting Artificial Intelligence to Work
Rotman Management
Article
Putting Artificial Intelligence to Work
May 1, 2018
11 min read

Related categories

Skip carousel

Reviews for Exploratory and Multivariate Data Analysis

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Exploratory and Multivariate Data Analysis - Michel Jambu

Exploratory and Multivariate Data Analysis

First Edition

Michel Jambu

National Centre for Telecommunications Studies, Paris, France

ACADEMIC PRESS, INC.

Harcourt Brace Jovanovich, Publishers

Boston San Diego New York

London Sydney Tokyo Toronto

Cover image

Title page

Copyright page

Dedication

Preface

Why this book?

What is in this book?

For whom is the book written?

What is the prerequisite knowledge needed?

Acknowledgments

Chapter 1: General Presentation

1 Introduction

2 Examples of Applications

3 Steps in Data Exploration: Management, Analysis, Synthesis

4 Computer Aspects

Chapter 2: Statistical Data Elaboration

1 Statistics

2 Fields of Statistical Data Exploration

3 Statistics and Experiments

4 Data Analysis, Inductive and Deductive Statistics

5 Variables, Statistical Sets, and Data Sets

Chapter 3: 1-D Statistical Data Analysis

1 Introduction

2 1-D Analysis of a Quantitative Variable

3 1-D Analysis of a Categorical Variable

4 1-D Analysis of a Categorical Variable with Multiple Forms

5 1-D Analysis of Time Series or Chronological Variables

6 Statistical Maps or Cartograms

Chapter 4: 2-D Statistical Data Analysis

1 Introduction

2 2-D Analysis of Two Categorical Variables

3 2-D Analysis of Two Quantitative Variables

4 2-D Analysis of a Quantitative Variable and a Categorical Variable

5 2-D Analysis of a Quantitative Variable and a Categorical Variable with Multiple Forms

6 Conclusion

Chapter 5: N-D Statistical Data Analysis

1 Introduction

2 Joint 3-D Statistical Data Analysis

3 Joint N-D Statistical Data Analysis

4 Cartograms and N-D Analysis

Chapter 6: Factor Analysis of Individuals–Variables Data Sets

1 Introduction

2 From Linear Adjustment to Factor Analysis

3 From the Origin of Factor Analysis to Modern Factor Analysis Techniques

4 Mathematical Description of Modern Factor Analysis

5 Factor Analysis Formulas

Chapter 7: Principal Components Analysis

1 Basic Data Sets

2 Different Patterns of Principal Components Analysis

3 Standardized Principal Components Analysis

4 Interpretation of Principal Components Analysis

5 Classifying Supplementary Points into Graphics

6 Rules for Selecting Significant Axes and Elements

7 Standardized Principal Components Analysis Formulas

8 Applications and Case Studies

Chapter 8: 2-D Correspondence Analysis

1 Introduction

2 Basic Correspondence Data Sets

3 Mathematical Description of Correspondence Analysis

4 Geometric Representation of the Sets I and J

5 Interpretation of the 2-D Correspondence Analysis

6 Factor Graphics

7 Classifying Supplementary Points into Graphics

8 Rules for Selecting Significant Axes and Elements

9 2-D Correspondence Analysis Formulas

10 Patterns of Clouds of Points

11 Patterns of Acceptable Data Sets

12 Case Studies

Chapter 9: N-D Correspondence Analysis

1 Introduction

2 Basic Data Sets

3 Equivalence between the Analyses of bJJ and kIJ

4 Interpretation of N-D Correspondence Analysis

5 Factor Graphics

6 Classifying Supplementary Points into Graphics

7 Rules for Selecting Significant Axes and Points of N(I), N(J), and N(Q)

8 N-D Correspondence Analysis Formulas

9 Patterns of Acceptable Data Sets

10 Case Studies

Chapter 10: Classification of Individuals–Variables Data Sets

1 Introduction

2 Basic Data Sets

3 The Mathematical Description of Classifications

4 Partitioning Methods

5 Hierarchical Classification Methods

6 Specific Applications

7 Case Studies

Chapter 11: Classification and Analysis of Proximities Data Sets

1 Introduction

2 Proximities Data Sets

3 Proximities Data Sets from Individuals–Variables Data Sets

4 Elementary Description of Proximities Data Sets

5 Factor Analysis of Proximities Data Sets

6 Classification of Proximities Data Sets

7 Computation of Contributions

8 Conclusion

Chapter 12: Computer Aspects of Exploratory and Multivariate Data Analysis

1 Place of Exploratory and Multivariate Data Analysis in Statistics

2 Basic Features for an Exploratory and Multivariate Data Analysis Software

3 Data Analysis Libraries

4 Future Prospects

Appendix 1: List of Notations

1 General Notations

2 Specific Notations from Chapters 1 to 5

3 Specific Notations from Chapters 6 to 9

4 Specific Notations from Chapters 10 and 11

Appendix 2: Reference Data Sets

1 Cars Models

2 Marks of Students

3 Statistics of Patents Registration

4 Preferences Given by Students

5 Responses to a Questionnaire on New Services in Telecommunications

6 Financial Data Set

7 Measurements Data Set on Skulls

8 Steel Samples

9 Economic Data Set Concerning Investments Abroad

10 Family Timetables

11 Semantic Field Associated with Colors

12 Proximities Data Set from the Family Timetables Data Set

13 Barataria Data of Grain-Size Measurements

14 Quality of Service in the Telephone Network

15 Crimes Data in the United States of America For 1977

16 Table of percentage points of the χ² distribution

References

Author Index

Subject Index

Copyright

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

ACADEMIC PRESS, INC.

1250 Sixth Avenue, San Diego, CA 92101

United Kingdom Edition published by

ACADEMIC PRESS LIMITED

24–28 Oval Road, London NW1 7DX

Library of Congress Cataloging-in-Publication Data:

Jambu, Michel.

[Exploration informatique et statistique des données. English]

Exploratory and multivariate data analysis/Michel Jambu.

p. cm.—(Statistical modeling and decision science)

Translation of: Exploration informatique et statistique des

données.

Includes bibliographical references and index.

ISBN 0-12-380090-0 (alk. paper)

1. Mathematical statistics—Data processing. I. Title.

II. Series.

QA276.4.J3613 1991

519.5'0285—dc20 90-23003

CIP

Printed in the United States of America

91 92 93 94 9 8 7 6 5 4 3 2 1

Dedication

To Catherine, Hugo Sébastien, Thomas

The essence of coding data is to translate faithfully the relations observed between things into relations between mathematical entities, in such a way that, by reducing through computation the mathematical structure chosen as an image of reality, one obtains a simplified picture of it, accessible to intuition and reflection, with the guarantee of mathematical criticism.

J.P. Benzécri, in Les Cahiers de l’Analyse des Données, Vol. II, 1977, no 4, 369–406

Preface

Why this book?

After travelling around the world, studying many kinds of data, listening to many lectures on subjects of data analysis, and giving seminars, it became clear that the way data analysis is studied in France, with exploration by Benzécri and his associates, is actually different from data analysis anywhere else in the world.

When I published Data Analysis and Clustering in 1983, correspondence analysis and related topics were known worldwide to French-speaking people but not in the English-speaking world. It was one of the first attempts to present correspondence analysis and associated methods of data analysis to English-speaking readers. Several colleagues then encouraged me to publish a textbook on correspondence analysis and the French method of data analysis. I was not actually satisfied by this proposal, because data analysis is the same around the world, even if the techniques associated with it vary. Finally, I gathered data analysis materials from different sources. There were so many connections and interactions among them that I combined them in order to propose a modern way of thinking about and practicing data analysis; the point is not only to use techniques but to use the interactions and relations between them with a view to summarizing data for improving knowledge, drawing valid conclusions, and aiding in decision making. The way was found; it remained to write the book.

What is in this book?

The heart of this book contains methods of exploring data from a statistical data analysis point of view, from the most elementary, associated with univariate and bivariate statistical description, to the most advanced, associated with multivariate statistical description, factor analysis, correspondence analysis and clustering. They are presented in such a manner that they correspond to exploration of data sets, step-by-step, to allow readers to build their own data analysis strategies from their data sets. The titles of the chapters and the general plan of the book are as follows: The first chapter presents a general introduction to the basic principles and steps of statistical data analysis with some case studies. The following chapters are presented in the order of the data analysis process: elaboration of data sets (Chapter 2), 1-D statistical data analysis (Chapter 3), 2-D statistical data analysis (Chapter 4), N-D statistical data analysis (Chapter 5), factor analysis of individuals–variables data sets (Chapter 6), principal components analysis (Chapter 7), 2-D correspondence data analysis (Chapter 8), N-D correspondence data analysis (Chapter 9), classification of individuals–variables data sets (Chapter 10), and analysis and classification of proximities data sets (Chapter 11). Chapter 12 is devoted to the computer aspects of data analysis. A list of notations, an appendix containing the data sets used as examples, and as usual, references, conclude the book.

For whom is the book written?

This book is written for anyone who analyzes data or expects to do so in the future, including students, statisticians, scientists, engineers, managers, and teachers. The material presented here is relevant for applications in various fields, such as physics, chemistry, medicine, business, management, marketing, economics, psychology, sociology, geosciences, biology, astronomy, quality control, engineering, computer science, education, linguistics, and virtually any other field where there are data to be analyzed, synthesized, or explored with the goal of improving knowledge or decision making. This book can also be used as a reference or as a supplement to any course in applied statistics, or in applied sciences courses where statistics are taught.

What is the prerequisite knowledge needed?

Chapters 1–5 do not assume any previous knowledge. The material can be understood by anyone who wants to learn it and who has some experience or interest in quantitative thinking. Chapters 6–9 assume a knowledge of the previous chapters and an understanding of data in terms of interactions between multiple data sets. These chapters are devoted to methods for solving complex problems involving complex data sets. The mathematical background needed is the first level in any linear algebra course. Chapter 10 assumes an interest in taxonomic problems but no specific knowledge; the mathematical background needed is that of the first level in any university. Chapter 11 assumes a knowledge of Chapters 6–10. It is an introduction to a more general case of data often used in taxonomy and in multidimensional scaling. Chapter 12 assumes an interest in monitoring computer software on real data. It contains some recommendations for users of data analysis. In conclusion, no mathematical, statistical, or computer knowledge is required; just common sense.

Acknowledgments

I would need many pages to thank all the people who have led directly or indirectly to the publication of this book. I have dedicated this book to Professor J. P. Benzécri in acknowledgment of the role he played in my data analysis education. To all those who encouraged me to publish a textbook devoted to data analysis, correspondence analysis, and related topics, I extend my warmest thanks: I. Olkin, C. Hayashi, J. Kruskal, R. Sokal, N. Ohsumi, P. Tukey, J. R. Kettenring, D. Carroll, and D. Merriam, to name a few. Particular thanks are given to H. Teil and F. Murtagh for their critical reading and revising of the manuscript; to G. André, Chief Director of the Centre National d’Etudes des Télécommunications, who efficiently supervised the realization of the manuscript; to the staff of Academic Press for their excellent collaboration in passing the book through the press; last, but not least, to Mrs N. Tissédre, for her patient work on the painstaking preparation of the manuscript. Final thanks go to the Centre National d’Etudes des Télécommunications and the Société Francophone de Classification for their generous financial help, and the S.C.C.M. Inc. for its excellent realization of figures.

Paris, 1990

Chapter 1

General Presentation

1 Introduction

1.1 Aim and Scope of Statistical Data Exploration

The aim of data analysis is to discover the structure of a set of multivariate observations without the assumption of any mathematical hypotheses on the structure of these observations or variables. Because of the size and complexity of the data sets, this structure cannot be discovered directly; specific data processing methods are therefore required to manage, explore, analyze, synthesize, and communicate the results of data processing. These methods are oriented according to the desired goal: improving basic knowledge of a field; diagnosis; forecasting; planning; decision making. Whatever the goal, the statistical features of the observed data sets need to be highlighted. Data analysis methods are the most appropriate ones for doing this.

1.2 What Does Data Mean?

Data is a set of organized information of any type, covering all aspects of a domain related to a specific goal (forecasting, improving knowledge, causal analysis, decision making, etc.). It is a quantification of the real world into an image, acceptable to the human brain, and then to the computer. For example, when the quality of cars is studied, the quality is initially defined in terms of certain criteria; the information concerning these criteria observed on a selected set of cars (a sample) is then gathered. For example, criteria such as mileage, number of repairs, headroom, weight, length, turn circle, and gear ratio are collected and recorded in a data file or data base. All the information is stored in a data set that contains heterogeneous data, in general. Examine the data set given in Table 1.1. It is in the form of the rows and columns of a matrix. Each column and each row has a label; at the intersection of a column and a row is the information related to one variable observed on one car model. Naturally, there are many types of data sets. For example, consider the first column of the data set given in Table 1.1. It concerns the price of cars at a given time. This is a simple, or 1-D, data set as only one variable is observed. The whole data set given in Table 1.1 concerns the simultaneous observation of 12 variables on a given set of cars, and so it is a multiple, or N-D, data set. The complexity of data depends on the field of study and/or on the initial aim, and/or on the degree of detail associated with the study. Thus, the data sets studied by data analysis involve quantitative information (measurements, ratios, marks, indicators, etc.) or qualitative (also called categorical) information (categories, logical attributes, intervals of quantitative information, etc.). A data set can involve homogeneous or heterogeneous information. Finally, depending on the goal, a data set can be divided into explanatory and explainable information. Generally, when the domain is large enough, the reference data sets contain all the different types of information. This is true in information systems or data bases. The problem is how to explore and process the data.

Table 1.1

Car models data set (extract).

(From Graphical Methods for Data Analysis, by J.M. Chambers, W.S. Cleveland, B. Kleiner, and P.A. Tukey. Copyright © 1983 by Bell Telephone Laboratories Incorporated, Murray Hill, NJ. Reprinted by permission of Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA 93950.)
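The row-and-column layout described in Section 1.2 can be sketched in a few lines of Python. This is only an illustration: the car labels, the three variables kept, and every value are invented placeholders and are not entries of Table 1.1; the helper names `column` and `cell` are likewise hypothetical.

```python
# Hypothetical miniature of an individuals-variables data set.
# Labels and values are invented for illustration; they are NOT from Table 1.1.
data_set = {
    "car_a": {"price": 4000, "mileage": 22, "weight": 2900},
    "car_b": {"price": 6300, "mileage": 17, "weight": 3400},
    "car_c": {"price": 3800, "mileage": 31, "weight": 2100},
}


def column(data, variable):
    """Return a 1-D data set: one variable observed on every individual."""
    return {label: row[variable] for label, row in data.items()}


def cell(data, label, variable):
    """Return the value at the intersection of one row and one column."""
    return data[label][variable]


prices = column(data_set, "price")            # a simple (1-D) data set
n_rows = len(data_set)                        # number of individuals (rows)
n_cols = len(next(iter(data_set.values())))   # number of variables (columns)
```

Taking one column gives a 1-D data set, whereas the whole dictionary, with several variables observed simultaneously on the same individuals, plays the role of an N-D data set.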

1.3 What Does “to analyze data” Mean?

To analyze data means to synthesize the content of data in a data base or a data file, by selecting specific data sets on which data analysis methods can be applied. Obviously, no method can analyze a disorganized data set. To be described, data must follow specific rules such as homogeneity, exhaustivity, and comparability. Thus, the first step of data analysis is to extract relevant data sets that can be analyzed whilst keeping in mind the objectives, which may vary. In an example about the quality of telephone service, the problem is to study levels of quality and to select statistically determined units from a given range of quality. In medicine, the problem is to study how different variables interact on a group of patients. In marketing, the problem is how to forecast consumer behavior by observing selected variables on selected users. Basically, to analyze data means to choose data sets on which data analysis methods can be applied, with a view to decision making, selection, planning, forecasting, or understanding. Because data are often too complex, too large, and too numerous to be examined directly, specific tools are needed to dissect them and to produce numerical or graphical summaries. This specific type of data processing follows a logical process described in Section 3.1 on the different steps of data exploration.

1.4 What Does “to synthesize data” Mean?

In any statistical study, there are two steps: analysis and synthesis. To synthesize data means to gather the most significant or the most telling features within the data. The results are presented in a way that is convenient for the user. Thus, the problem is not only to analyze data in depth, but also to communicate the results in terms of valid conclusions that can be used to make reasonable decisions. When data analysis was first used, analysis meant both analysis and synthesis. But, given recent developments in methods and in the size and complexity of data, analysis and synthesis must again be distinguished. The basic principles of data analysis are presented here in comparison with other scientific trends.

1.5 Basic Principles

Data analysis belongs to Statistics in the following sense:

Statistics is concerned with scientific methods for collecting, organizing, summarizing, presenting, analyzing data, as well as drawing valid conclusions and making reasonable decisions on the basis of such analysis.

(cf. Spiegel, 1961). It is opposed to experimental methods based on observing the variations of one variable with respect to all the others involved. Statistics and data analysis are based on data as they are collected. All of the possible variations for all of the variables cannot be studied and, most of the time, the control of these variables is impossible, as in economics, marketing, sociology, meteorology, geology, etc. Experimental methods are appropriate for specific classes of measurements. Statistics or data analysis methods can process a larger class of data than those used in experimental methods.

In Statistics, there are two currents: the inductive process and the deductive process. Data analysis is concerned with the deductive process; it means to deduce only from gathered data, and not to build a model first. The basic data analysis principles are expressed as follows:

(a) To extract structures from data, and not the reverse.

(b) To process simultaneously information involving multiple variables.

(d) To use all the resources of a computer, particularly graphical tools.

Certain remarks can be made:

(a) Often the opposite is done; models smooth out data. Thus, it is taken as real what is purely a mathematical construct. It often happens that data are mutilated because it is thought that they cannot be processed by computer. But, it should be kept in mind that methods and software are now able to process data in depth.

(b) To analyze data variable by variable takes time and does not provide a synthesis. To do so, interactions between pieces of information must be studied globally.

(c) Sometimes data are built in successive layers, producing incoherency. Even if data are elaborated independently from data processing, they must be elaborated with a view to data processing.

(d) Graphics give more information than numerical tables. A histogram highlights the shape of a distribution; factor maps give more information than correlation matrices; dispersion box plots represent more than any statistical measures. In the following, some examples of real applications are given.
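Remark (d) can be illustrated with a minimal sketch, assuming two invented samples that were constructed so that their means and variances coincide. The numerical summaries cannot tell them apart, while even a crude text histogram shows that one is bimodal and the other is peaked with two outlying values. The sample values and the helper `text_histogram` are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance

# Two invented samples constructed to share the same mean and variance.
bimodal = [1, 1, 1, 1, 5, 5, 5, 5]
peaked = [-1, 3, 3, 3, 3, 3, 3, 7]


def text_histogram(sample):
    """Draw one line per distinct value, with one star per observation."""
    counts = Counter(sample)
    return "\n".join(
        f"{value:>3} | {'*' * counts[value]}" for value in sorted(counts)
    )


for name, sample in (("bimodal", bimodal), ("peaked", peaked)):
    # Identical numerical summaries ...
    print(f"{name}: mean={mean(sample)}, variance={pvariance(sample)}")
    # ... but visibly different shapes.
    print(text_histogram(sample))
```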

2 Examples of Applications

2.1 Economic Data: Car Models

To study the economic quality of cars, 37 cars were selected as a representative set. The variables observed were the price, mileage, repair record, headroom, rear seat, trunk space, weight, length, turn circle, displacement and gear ratio. These variables are assumed to influence both the economic quality and the price of a car (the data are given in Table 1.1). Figures 1.1 and 1.2 give the results of principal components analysis and its hierarchical classification performed on the car data set. The principal components analysis highlights two factors, and the resulting factor map shows the cars and the main criteria as points. To the right of the first axis are found the smaller cars and more generally the Japanese ones (Datsun, Honda) with high gear ratio and mileage; to the left of the first axis occur the larger cars and more generally the American cars, which are comfortable (rear seat, trunk space, headroom) but heavy and more expensive than the smaller cars. This is confirmed by the hierarchical classification given in Fig. 1.2.

Figure 1.1 Representation in the two first factors (principal components analysis).

Figure 1.2 Car models classification. Hierarchical clustering of a principal components data set.
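As a rough sketch of how a first factor of the kind shown in Fig. 1.1 can be obtained, the Python code below standardizes a small table, forms its correlation matrix, and extracts the dominant axis. Several assumptions apply: the five cars and their measurements are invented, only three variables are used, and a simple power iteration stands in for a full eigen-decomposition. It is not the procedure used to produce the figures of the book.

```python
import math

# Invented measurements (weight, length, mileage) for five hypothetical cars.
labels = ["car_a", "car_b", "car_c", "car_d", "car_e"]
rows = [
    [2000.0, 160.0, 35.0],
    [2500.0, 170.0, 30.0],
    [3000.0, 185.0, 24.0],
    [3500.0, 200.0, 18.0],
    [4000.0, 210.0, 15.0],
]


def standardize(table):
    """Center each column and scale it to unit (population) standard deviation."""
    n, p = len(table), len(table[0])
    means = [sum(r[j] for r in table) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in table) / n) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in table]


def correlation_matrix(z):
    """Correlation matrix of an already standardized table."""
    n, p = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / n for b in range(p)] for a in range(p)]


def first_axis(c, iterations=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of c."""
    p = len(c)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(c[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(c[a][b] * v[b] for b in range(p)) for a in range(p))
    return eigenvalue, v


z = standardize(rows)
eigenvalue, axis = first_axis(correlation_matrix(z))
# Coordinate of each car on the first axis.
scores = {label: sum(x * w for x, w in zip(r, axis)) for label, r in zip(labels, z)}
share = eigenvalue / len(axis)  # fraction of the total variance carried by the axis
```

In this invented example weight and length load on the axis with one sign and mileage with the other, so light, economical cars and heavy ones fall at opposite ends. The orientation of an axis is arbitrary, so which group appears on the right depends on the sign convention.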

2.2 Industrial Data: International Evolution of Patents Registration

The number of patents is considered a good indicator of industrial activity. Two branches are studied: the telecommunications branch and all of the branches mixed. The data set is organized into two subsets (cf. Appendix 2, §3) simultaneously analyzed by correspondence analysis. The factorial map is given in Fig. 1.3. This map shows the relative position of each country for its own telecommunication branch with respect to the total of all branches. During the period 1980–1986, the number of patents registered increased for the USA, Japan, Italy, and Sweden, and was stable or decreased for FRG, Great Britain, and France. For the telecommunication branch, the movement is amplified. The telecommunication branches of Japan, the Netherlands, USA, and Great Britain are increasing, whereas those of France, FRG, Italy, and Switzerland are decreasing. This map is self-explanatory.

Figure 1.3 Statistics concerning the patents registration according to the telecommunications branch and all the branches mixed. Correspondence analysis; representation in the two first factors.
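Correspondence analysis compares the rows of a table through their profiles, that is, each row divided by its own total, so that countries with very different patent volumes become comparable. The sketch below shows only this first step. The country labels and counts are invented and are not the data of Appendix 2, §3; the function names are hypothetical.

```python
# Invented counts: rows are hypothetical countries, columns are years.
years = [1980, 1983, 1986]
counts = {
    "country_x": [10, 20, 30],
    "country_y": [30, 20, 10],
}


def row_profiles(table):
    """Divide each row by its total so that rows of different sizes are comparable."""
    return {
        label: [value / sum(values) for value in values]
        for label, values in table.items()
    }


def average_profile(table):
    """Column totals divided by the grand total (the profile of the whole table)."""
    n_cols = len(next(iter(table.values())))
    grand_total = sum(sum(values) for values in table.values())
    return [
        sum(values[j] for values in table.values()) / grand_total
        for j in range(n_cols)
    ]


profiles = row_profiles(counts)
center = average_profile(counts)
```

Here `country_x` has a profile that rises over the years and `country_y` one that falls, while the average profile is flat; on a factor map such rows would lie on opposite sides of the origin.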

2.3 Marketing Data: Survey Concerning Users of New Services in Telecommunications

To study the behavior and satisfaction (or lack of satisfaction) of users of new services, France Telecom carried out surveys using questionnaires on 1800 people. The new services are the electronic directory and all of the associated distributing services requested using the Minitel, which is a piece of telecommunication equipment resembling a computer terminal. The questionnaire consists of 70 multiple choice questions. The data set analyzed is a logical data set involving 252 dummy variables and 1800 persons (the dummy variables are the replies to the questions). It was analyzed by N-D correspondence analysis; here, we give two selected graphics, showing a part of the dummy variables. Figure 1.4 represents user satisfaction; Fig. 1.5 represents the usage of the new Minitel services. These two graphics can be superimposed. Data analysis processing of surveys needs more detailed analysis than for contingency data sets. This will be discussed further in Chapter 9.

Figure 1.4 Questionnaire on Minitel. N-D correspondence analysis; representation in the two first factors. Representation of the variables concerning the usage and the price of the Minitel.

Figure 1.5 Questionnaire on Minitel. N-D correspondence analysis; representation in the two first factors. Representation of the variables concerning the knowledge and usage of the Minitel services.
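The logical data set mentioned in Section 2.3 is obtained by turning every possible reply into a 0/1 dummy variable; in the survey, 70 questions yield 252 such variables. A minimal Python sketch of this coding follows, assuming two invented questions and two invented respondents. The question names, answer categories, and helper functions are hypothetical and do not reproduce the real questionnaire.

```python
# Invented questions and answer categories; the real questionnaire is not reproduced.
categories = {
    "uses_directory": ["never", "sometimes", "often"],
    "satisfied": ["no", "yes"],
}
responses = [
    {"uses_directory": "often", "satisfied": "yes"},
    {"uses_directory": "never", "satisfied": "no"},
]


def dummy_names(cats):
    """One column label per (question, reply) pair."""
    return [f"{q}={c}" for q, options in cats.items() for c in options]


def to_dummies(response, cats):
    """Code one respondent as 0/1 values, with exactly one 1 per question."""
    return [1 if response[q] == c else 0 for q, options in cats.items() for c in options]


names = dummy_names(categories)
logical_table = [to_dummies(r, categories) for r in responses]
```

Each row of `logical_table` sums to the number of questions, since a respondent gives exactly one reply per question; this constant row total is a convenient check on the coding.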

2.4 Geological Data: Barataria Grain Size Study

Krumbein and Aberdeen (1937) collected 98 bottom samples from the tidal lagoon in Barataria Bay at the margin of the Mississippi delta, with the objective of evaluating the depositional environment of the lagoon. Data were recorded on the grain-size distribution of the samples (cf. Appendix 2, §13). Only 69 samples (with complete description) are retained for processing by correspondence analysis (cf. Fig. 1.6). The first axis clearly represents the evolution from coarse to fine grained sediments (cf. Teil, 1985).

Figure 1.6 PH defines the different grain sizes.

2.5 Sociological Data: Family Timetables

In 1965, an international organization wanted to study and compare the lifestyle chosen according to marital status (single or married), sex (male or female), country, and professional activity. In this study, lifestyle was viewed through 10 major activities (professional work, transportation, sleep, household, meals, shopping, children, personal care, TV, leisure). The data set was built by taking into account the number of hours spent by a group on these different activities (cf. Appendix 2, §10). A principal components analysis was done (cf. Fig. 1.7); it highlights the relationships between the population groups and variables. For example, the second axis opposes the western countries (Europe) to the USA according to two groups of variables: meals and sleep for Europe on the one hand; personal care and shopping for the USA on the other.

Figure 1.7 Family timetables. Principal components analysis. Representation in the two first factors.

3 Steps in Data Exploration: Management, Analysis, Synthesis

Data analysis involves several steps from data conception to the use of final results in decision making. We present the steps and the relations among them, set in a network where the vertices are the steps and the edges the relations (cf. Fig. 1.8). Ten steps are identified and examined in detail. But, keep in mind that data analysis involves interaction with data and steps taken to analyze them.

Figure 1.8 Data analysis network.

STEP 1. Data decision. At the beginning, there is someone who decides on an action. It could be the manager (in business), the scientist (in fundamental sciences), the physician (in medicine), the agronomist (in studying plants), the decision maker (in marketing), etc. What does he decide? To study a field based on some hypotheses. Therefore, he must define the aim and scope of the study, the boundary of the field, and depending on his knowledge, draw the main features and the orientations of what he wants, and then determine the data expected to be necessary to describe or explain the problem he is trying to solve.

STEP 2. Data conception, data elaboration (Chapter 2). This is a hard
