Multiple Imputation of Missing Data Using SAS

Ebook386 pages3 hours

Multiple Imputation of Missing Data Using SAS

Name: Multiple Imputation of Missing Data Using SAS
Author: Patricia Berglund
ISBN: 9781629592039

By Patricia Berglund and Steven G. Heeringa

Rating: 0 out of 5 stars

()

Read preview

About this ebook

Find guidance on using SAS for multiple imputation and solving common missing data issues.

Multiple Imputation of Missing Data Using SAS provides both theoretical background and constructive solutions for those working with incomplete data sets in an engaging example-driven format. It offers practical instruction on the use of SAS for multiple imputation and provides numerous examples that use a variety of public release data sets with applications to survey data.

Written for users with an intermediate background in SAS programming and statistics, this book is an excellent resource for anyone seeking guidance on multiple imputation. The authors cover the MI and MIANALYZE procedures in detail, along with other procedures used for analysis of complete data sets. They guide analysts through the multiple imputation process, including evaluation of missing data patterns, choice of an imputation method, execution of the process, and interpretation of results.

Topics discussed include how to deal with missing data problems in a statistically appropriate manner, how to intelligently select an imputation method, how to incorporate the uncertainty introduced by the imputation process, and how to incorporate the complex sample design (if appropriate) through use of the SAS SURVEY procedures.

Discover the theoretical background and see extensive applications of the multiple imputation process in action.

This book is part of the SAS Press program.

Skip carousel

LanguageEnglish

PublisherSAS Institute

Release dateJul 1, 2014

ISBN9781629592039

Author

Patricia Berglund

Patricia Berglund is a Senior Research Associate in the Survey Methodology Program at the University of Michigan Institute for Social Research (ISR). She has extensive experience in the use of SAS for data management and analysis. She is a faculty member in the ISR's Summer Institute in Survey Research Techniques and also directs the ISR's SAS training programs. Berglund also teaches a SAS Business Knowledge Series class titled, "Imputation Techniques in SAS." Her primary research interests are mental health, youth substance issues, and survey methodology.

Related authors

Skip carousel

Related to Multiple Imputation of Missing Data Using SAS

Related ebooks

Skip carousel

PROC REPORT by Example: Techniques for Building Professional Reports Using SAS: Techniques for Building Professional Reports Using SAS
Ebook
PROC REPORT by Example: Techniques for Building Professional Reports Using SAS: Techniques for Building Professional Reports Using SAS
byLisa Fine
Rating: 0 out of 5 stars
0 ratings
The SAS Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques
Ebook
The SAS Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques
byJane Eslinger
Rating: 0 out of 5 stars
0 ratings
SAS Statistics by Example
Ebook
SAS Statistics by Example
byRon Cody
Rating: 5 out of 5 stars
5/5
Biostatistics by Example Using SAS Studio
Ebook
Biostatistics by Example Using SAS Studio
byRon Cody
Rating: 0 out of 5 stars
0 ratings
Implementing CDISC Using SAS: An End-to-End Guide, Revised Second Edition
Ebook
Implementing CDISC Using SAS: An End-to-End Guide, Revised Second Edition
byChris Holland
Rating: 0 out of 5 stars
0 ratings
Analysis of Clinical Trials Using SAS: A Practical Guide, Second Edition
Ebook
Analysis of Clinical Trials Using SAS: A Practical Guide, Second Edition
byCSPtrade2
Rating: 0 out of 5 stars
0 ratings
Fundamentals of Programming in SAS: A Case Studies Approach
Ebook
Fundamentals of Programming in SAS: A Case Studies Approach
byJames Blum
Rating: 0 out of 5 stars
0 ratings
The SAS Programmer's PROC REPORT Handbook: ODS Companion
Ebook
The SAS Programmer's PROC REPORT Handbook: ODS Companion
byJane Eslinger
Rating: 0 out of 5 stars
0 ratings
Logistic Regression Using SAS: Theory and Application, Second Edition
Ebook
Logistic Regression Using SAS: Theory and Application, Second Edition
byPaul D. Allison
Rating: 4 out of 5 stars
4/5
SAS Macro Programming Made Easy, Third Edition
Ebook
SAS Macro Programming Made Easy, Third Edition
byMichele M. Burlew
Rating: 3 out of 5 stars
3/5
Practical and Efficient SAS Programming: The Insider's Guide
Ebook
Practical and Efficient SAS Programming: The Insider's Guide
byMartha Messineo
Rating: 0 out of 5 stars
0 ratings
Categorical Data Analysis Using SAS, Third Edition
Ebook
Categorical Data Analysis Using SAS, Third Edition
byMaura E. Stokes
Rating: 0 out of 5 stars
0 ratings
End-to-End Data Science with SAS: A Hands-On Programming Guide
Ebook
End-to-End Data Science with SAS: A Hands-On Programming Guide
byJames Gearheart
Rating: 0 out of 5 stars
0 ratings
Learning SAS by Example: A Programmer's Guide, Second Edition
Ebook
Learning SAS by Example: A Programmer's Guide, Second Edition
byRon Cody
Rating: 3 out of 5 stars
3/5
Exercises and Projects for The Little SAS Book, Sixth Edition
Ebook
Exercises and Projects for The Little SAS Book, Sixth Edition
byRebecca A. Ottesen
Rating: 0 out of 5 stars
0 ratings
Cody's Data Cleaning Techniques Using SAS, Third Edition
Ebook
Cody's Data Cleaning Techniques Using SAS, Third Edition
byRon Cody
Rating: 5 out of 5 stars
5/5
PROC SQL: Beyond the Basics Using SAS, Third Edition
Ebook
PROC SQL: Beyond the Basics Using SAS, Third Edition
byKirk Paul Lafler
Rating: 0 out of 5 stars
0 ratings
SAS Programming in the Pharmaceutical Industry, Second Edition
Ebook
SAS Programming in the Pharmaceutical Industry, Second Edition
byJack Shostak
Rating: 5 out of 5 stars
5/5
Biostatistics and Computer-based Analysis of Health Data Using SAS
Ebook
Biostatistics and Computer-based Analysis of Health Data Using SAS
byChristophe Lalanne
Rating: 0 out of 5 stars
0 ratings
The Little SAS Book: A Primer, Sixth Edition
Ebook
The Little SAS Book: A Primer, Sixth Edition
byLora D. Delwiche
Rating: 5 out of 5 stars
5/5
Data Quality for Analytics Using SAS
Ebook
Data Quality for Analytics Using SAS
byGerhard Svolba
Rating: 4 out of 5 stars
4/5
Deep Learning for Numerical Applications with SAS
Ebook
Deep Learning for Numerical Applications with SAS
byHenry Bequet
Rating: 0 out of 5 stars
0 ratings
Applied Data Mining for Forecasting Using SAS
Ebook
Applied Data Mining for Forecasting Using SAS
byTim Rey
Rating: 0 out of 5 stars
0 ratings
Simulating Data with SAS
Ebook
Simulating Data with SAS
byRick Wicklin
Rating: 0 out of 5 stars
0 ratings
Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS
Ebook
Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS
byDr. Goutam Chakraborty
Rating: 0 out of 5 stars
0 ratings
SAS for Forecasting Time Series, Third Edition
Ebook
SAS for Forecasting Time Series, Third Edition
byJohn C. Brocklebank, Ph.D.
Rating: 0 out of 5 stars
0 ratings
Carpenter's Guide to Innovative SAS Techniques
Ebook
Carpenter's Guide to Innovative SAS Techniques
byArt Carpenter
Rating: 0 out of 5 stars
0 ratings
The Data Detective's Toolkit: Cutting-Edge Techniques and SAS Macros to Clean, Prepare, and Manage Data
Ebook
The Data Detective's Toolkit: Cutting-Edge Techniques and SAS Macros to Clean, Prepare, and Manage Data
byKim Chantala
Rating: 0 out of 5 stars
0 ratings
Applying Data Science: Business Case Studies Using SAS
Ebook
Applying Data Science: Business Case Studies Using SAS
byGerhard Svolba
Rating: 0 out of 5 stars
0 ratings
An Introduction to SAS Visual Analytics: How to Explore Numbers, Design Reports, and Gain Insight into Your Data
Ebook
An Introduction to SAS Visual Analytics: How to Explore Numbers, Design Reports, and Gain Insight into Your Data
byTricia Aanderud
Rating: 5 out of 5 stars
5/5

Applications & Software For You

Skip carousel

Logic Pro X For Dummies
Ebook
Logic Pro X For Dummies
byGraham English
Rating: 0 out of 5 stars
0 ratings
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
Ebook
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
byNigel Tillery
Rating: 0 out of 5 stars
0 ratings
Hacks for TikTok: 150 Tips and Tricks for Editing and Posting Videos, Getting Likes, Keeping Your Fans Happy, and Making Money
Ebook
Hacks for TikTok: 150 Tips and Tricks for Editing and Posting Videos, Getting Likes, Keeping Your Fans Happy, and Making Money
byKyle Brach
Rating: 5 out of 5 stars
5/5
Adobe Illustrator: A Complete Course and Compendium of Features
Ebook
Adobe Illustrator: A Complete Course and Compendium of Features
byJason Hoppe
Rating: 0 out of 5 stars
0 ratings
GarageBand For Dummies
Ebook
GarageBand For Dummies
byBob LeVitus
Rating: 5 out of 5 stars
5/5
Blender 3D Basics Beginner's Guide Second Edition
Ebook
Blender 3D Basics Beginner's Guide Second Edition
byGordon Fisher
Rating: 5 out of 5 stars
5/5
Synthesizer Cookbook: How to Use Filters: Sound Design for Beginners, #2
Ebook
Synthesizer Cookbook: How to Use Filters: Sound Design for Beginners, #2
byScreech House
Rating: 3 out of 5 stars
3/5
Sound Design for Filmmakers: Film School Sound
Ebook
Sound Design for Filmmakers: Film School Sound
byMurray Stiller
Rating: 5 out of 5 stars
5/5
Learn to Code. Get a Job. The Ultimate Guide to Learning and Getting Hired as a Developer.
Ebook
Learn to Code. Get a Job. The Ultimate Guide to Learning and Getting Hired as a Developer.
byGwendolyn Faraday
Rating: 5 out of 5 stars
5/5
How to Create Cpn Numbers the Right way: A Step by Step Guide to Creating cpn Numbers Legally
Ebook
How to Create Cpn Numbers the Right way: A Step by Step Guide to Creating cpn Numbers Legally
byAlex Parkinson
Rating: 4 out of 5 stars
4/5
Excel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1
Ebook
Excel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1
byKevin Clark
Rating: 5 out of 5 stars
5/5
CompTIA Certification: The Ultimate Guide To Discover CompTIA. Certified Quickly And Easily Passing The Certification Exam. Real Practice Test With Detailed Screenshots, Answers And Explanations
Ebook
CompTIA Certification: The Ultimate Guide To Discover CompTIA. Certified Quickly And Easily Passing The Certification Exam. Real Practice Test With Detailed Screenshots, Answers And Explanations
byDavid Mayer
Rating: 0 out of 5 stars
0 ratings
Adobe Photoshop: A Complete Course and Compendium of Features
Ebook
Adobe Photoshop: A Complete Course and Compendium of Features
byStephen Laskevitch
Rating: 5 out of 5 stars
5/5
The Unofficial Guide to Open Broadcaster Software: OBS: The World's Most Popular Free Live-Streaming Application
Ebook
The Unofficial Guide to Open Broadcaster Software: OBS: The World's Most Popular Free Live-Streaming Application
byPaul Richards
Rating: 0 out of 5 stars
0 ratings
Hilarious Jokes for Minecrafters: Mobs, Creepers, Skeletons, and More
Ebook
Hilarious Jokes for Minecrafters: Mobs, Creepers, Skeletons, and More
byMichele C. Hollow
Rating: 1 out of 5 stars
1/5
Start Your Own Podcast Business: Your Step-By-Step Guide to Success
Ebook
Start Your Own Podcast Business: Your Step-By-Step Guide to Success
byThe Staff of Entrepreneur Media
Rating: 5 out of 5 stars
5/5
iPhone Photography For Dummies
Ebook
iPhone Photography For Dummies
byMark Hemmings
Rating: 0 out of 5 stars
0 ratings
The Little SAS Book: A Primer, Sixth Edition
Ebook
The Little SAS Book: A Primer, Sixth Edition
byLora D. Delwiche
Rating: 5 out of 5 stars
5/5
Affinity Photo How To
Ebook
Affinity Photo How To
byRobin Whalley
Rating: 0 out of 5 stars
0 ratings
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data
Ebook
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data
byEMC Education Services
Rating: 0 out of 5 stars
0 ratings
How Do I Do That In InDesign?
Ebook
How Do I Do That In InDesign?
byDave Clayton
Rating: 5 out of 5 stars
5/5
GarageBand Basics: The Complete Guide to GarageBand: Music
Ebook
GarageBand Basics: The Complete Guide to GarageBand: Music
byAventuras De Viaje
Rating: 0 out of 5 stars
0 ratings
Mastering QuickBooks 2020: The ultimate guide to bookkeeping and QuickBooks Online
Ebook
Mastering QuickBooks 2020: The ultimate guide to bookkeeping and QuickBooks Online
byCrystalynn Shelton
Rating: 0 out of 5 stars
0 ratings
Experts' Guide to OneNote
Ebook
Experts' Guide to OneNote
byJeremy P. Jones
Rating: 5 out of 5 stars
5/5
Vocal Rescue: Rediscover the Beauty, Power and Freedom in Your Singing
Ebook
Vocal Rescue: Rediscover the Beauty, Power and Freedom in Your Singing
byLois Alba
Rating: 4 out of 5 stars
4/5
Canon EOS Rebel T3/1100D For Dummies
Ebook
Canon EOS Rebel T3/1100D For Dummies
byJulie Adair King
Rating: 5 out of 5 stars
5/5
Six Figure Blogging In 3 Months
Ebook
Six Figure Blogging In 3 Months
byShekhar Mishra
Rating: 4 out of 5 stars
4/5
Mastering ChatGPT
Ebook
Mastering ChatGPT
byCharles J. Jones
Rating: 0 out of 5 stars
0 ratings
Adobe InDesign CC: A Complete Course and Compendium of Features
Ebook
Adobe InDesign CC: A Complete Course and Compendium of Features
byStephen Laskevitch
Rating: 0 out of 5 stars
0 ratings
FL Studio Cookbook
Ebook
FL Studio Cookbook
byShaun Friedman
Rating: 4 out of 5 stars
4/5

Related podcast episodes

Skip carousel

Data Observability - Barr Moses
Podcast episode
Data Observability - Barr Moses
byDataTalks.Club
0 ratings
0% found this document useful
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary: Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they all have different technologies and workflows that they focus on. To bring observability to dbt projects the team at Elementary embedded themselves into the workflow. In this episode Maayan Salom explores the approach that she has taken to bring observability, enhanced testing capabilities, and anomaly detection into every step of the dbt developer experience.
Podcast episode
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary: Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they all have different technologies and workflows that they focus on. To bring observability to dbt projects the team at Elementary embedded themselves into the workflow. In this episode Maayan Salom explores the approach that she has taken to bring observability, enhanced testing capabilities, and anomaly detection into every step of the dbt developer experience.
byData Engineering Podcast
0 ratings
0% found this document useful
Privacy-aware Data Pipelines with Skyflow’s Piper Keyes: A data analytics pipeline is important to modern businesses because it allows them to extract valuable insights from the large amounts of data they generate and collect on a daily basis. This leads to better decision making, improved efficiency, and ...
Podcast episode
Privacy-aware Data Pipelines with Skyflow’s Piper Keyes: A data analytics pipeline is important to modern businesses because it allows them to extract valuable insights from the large amounts of data they generate and collect on a daily basis. This leads to better decision making, improved efficiency, and ...
byPartially Redacted: Data Privacy, Security & Compliance
0 ratings
0% found this document useful
MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning" // Part 2
Podcast episode
MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning" // Part 2
byMLOps.community
0 ratings
0% found this document useful
BAM 068: Tagging, Data Models, and Data Normalization: Lately, there have been a ton of questions about data tagging and data normalization. But what does that actually mean? This is a super important topic especially when you consider that having common data formats is required to implement analytics,...
Podcast episode
BAM 068: Tagging, Data Models, and Data Normalization: Lately, there have been a ton of questions about data tagging and data normalization. But what does that actually mean? This is a super important topic especially when you consider that having common data formats is required to implement analytics,...
byThe Smart Buildings Academy Podcast | Teaching You Building Automation, Systems Integration, and Information Technology
0 ratings
0% found this document useful
Setting the Standard: Impact of Method Standardization in Chromatography
Podcast episode
Setting the Standard: Impact of Method Standardization in Chromatography
byThe Analytical Wavelength
0 ratings
0% found this document useful
[Bite] Documenting Data Science Projects
Podcast episode
[Bite] Documenting Data Science Projects
byDataCafé
0 ratings
0% found this document useful
051: Strategy evaluation techniques, flaws and solutions with Dave Walton: Today we’re covering a topic which can really be a concern for traders of all levels, from beginner to pro, and that is the topic of strategy evaluation. Have you ever found that real-life performance does not match expected results? Or perhaps you...
Podcast episode
051: Strategy evaluation techniques, flaws and solutions with Dave Walton: Today we’re covering a topic which can really be a concern for traders of all levels, from beginner to pro, and that is the topic of strategy evaluation. Have you ever found that real-life performance does not match expected results? Or perhaps you...
byBetter System Trader
0 ratings
0% found this document useful
Distributing Geospatial Data: Distributing Geospatial Data - Every wondered why you might what to do this? Or maybe you understand the why but are unsure about the how? Perhaps you have heard people talk about partitioning data or sharding data, you might have heard some of thes...
Podcast episode
Distributing Geospatial Data: Distributing Geospatial Data - Every wondered why you might what to do this? Or maybe you understand the why but are unsure about the how? Perhaps you have heard people talk about partitioning data or sharding data, you might have heard some of thes...
byThe MapScaping Podcast - GIS, Geospatial, Remote Sensing, earth observation and digital geography
0 ratings
0% found this document useful
Automate Your Pipeline Creation For Streaming Data Transformations With SQLake: Managing end-to-end data flows becomes complex and unwieldy as the scale of data and its variety of applications in an organization grows. Part of this complexity is due to the transformation and orchestration of data living in disparate systems. The team at Upsolver is taking aim at this problem with the latest iteration of their platform in the form of SQLake. In this episode Ori Rafael explains how they are automating the creation and scheduling of orchestration flows and their related transforations in a unified SQL interface.
Podcast episode
Automate Your Pipeline Creation For Streaming Data Transformations With SQLake: Managing end-to-end data flows becomes complex and unwieldy as the scale of data and its variety of applications in an organization grows. Part of this complexity is due to the transformation and orchestration of data living in disparate systems. The team at Upsolver is taking aim at this problem with the latest iteration of their platform in the form of SQLake. In this episode Ori Rafael explains how they are automating the creation and scheduling of orchestration flows and their related transforations in a unified SQL interface.
byData Engineering Podcast
0 ratings
0% found this document useful
Reconciling The Data In Your Databases With Datafold: A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.
Podcast episode
Reconciling The Data In Your Databases With Datafold: A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.
byData Engineering Podcast
0 ratings
0% found this document useful
Ep. 65 - Data Modeling
Podcast episode
Ep. 65 - Data Modeling
byWhat's Your Baseline? Enterprise Architecture & Business Process Management Demystified
0 ratings
0% found this document useful
Special Friday Episode: Practice test question with PCS Advantage!
Podcast episode
Special Friday Episode: Practice test question with PCS Advantage!
byPushing Pediatrics
0 ratings
0% found this document useful
Episode 41: Outcome Measures
Podcast episode
Episode 41: Outcome Measures
byPushing Pediatrics
0 ratings
0% found this document useful
Build Better Tests For Your dbt Projects With Datafold And data-diff: Data engineering is all about building workflows, pipelines, systems, and interfaces to provide stable and reliable data. Your data can be stable and wrong, but then it isn't reliable. Confidence in your data is achieved through constant validation and testing. Datafold has invested a lot of time into integrating with the workflow of dbt projects to add early verification that the changes you are making are correct. In this episode Gleb Mezhanskiy shares some valuable advice and insights into how you can build reliable and well-tested data assets with dbt and data-diff.
Podcast episode
Build Better Tests For Your dbt Projects With Datafold And data-diff: Data engineering is all about building workflows, pipelines, systems, and interfaces to provide stable and reliable data. Your data can be stable and wrong, but then it isn't reliable. Confidence in your data is achieved through constant validation and testing. Datafold has invested a lot of time into integrating with the workflow of dbt projects to add early verification that the changes you are making are correct. In this episode Gleb Mezhanskiy shares some valuable advice and insights into how you can build reliable and well-tested data assets with dbt and data-diff.
byData Engineering Podcast
0 ratings
0% found this document useful
DataFramed Careers Series Special Announcement!
Podcast episode
DataFramed Careers Series Special Announcement!
byDataFramed
0 ratings
0% found this document useful
10: Test Case Design using Given-When-Then from BDD: It doesn’t matter if you are using pytest, unittest, nose, or something completely different, this episode will help you write better tests.
Podcast episode
10: Test Case Design using Given-When-Then from BDD: It doesn’t matter if you are using pytest, unittest, nose, or something completely different, this episode will help you write better tests.
byTest and Code
0 ratings
0% found this document useful
Hyperspectral vs Multispectral: When comparing multispectral and hyperspectral data it is not simply a case of “more data more better”! With hyperspectral you have “The curse of Dimensionality” but you also get more flexibility to pick exactly what bands you want to use! With mult...
Podcast episode
Hyperspectral vs Multispectral: When comparing multispectral and hyperspectral data it is not simply a case of “more data more better”! With hyperspectral you have “The curse of Dimensionality” but you also get more flexibility to pick exactly what bands you want to use! With mult...
byThe MapScaping Podcast - GIS, Geospatial, Remote Sensing, earth observation and digital geography
0 ratings
0% found this document useful
Unpacking The Seven Principles Of Modern Data Pipelines: Data pipelines are the core of every data product, ML model, and business intelligence dashboard. If you're not careful you will end up spending all of your time on maintenance and fire-fighting. The folks at Rivery distilled the seven principles of modern data pipelines that will help you stay out of trouble and be productive with your data. In this episode Ariel Pohoryles explains what they are and how they work together to increase your chances of success.
Podcast episode
Unpacking The Seven Principles Of Modern Data Pipelines: Data pipelines are the core of every data product, ML model, and business intelligence dashboard. If you're not careful you will end up spending all of your time on maintenance and fire-fighting. The folks at Rivery distilled the seven principles of modern data pipelines that will help you stay out of trouble and be productive with your data. In this episode Ariel Pohoryles explains what they are and how they work together to increase your chances of success.
byData Engineering Podcast
0 ratings
0% found this document useful
Defining A Strategy For Your Data Products: The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products being the delivery mechanism for information. In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.
Podcast episode
Defining A Strategy For Your Data Products: The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products being the delivery mechanism for information. In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.
byData Engineering Podcast
0 ratings
0% found this document useful
An Overview Of The Sate Of Data Orchestration In An Increasingly Complex Data Ecosystem: Data systems are inherently complex and often require integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application to help inform its implementation in your environment.
Podcast episode
An Overview Of The Sate Of Data Orchestration In An Increasingly Complex Data Ecosystem: Data systems are inherently complex and often require integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application to help inform its implementation in your environment.
byData Engineering Podcast
0 ratings
0% found this document useful
Data Sharing Across Business And Platform Boundaries: Sharing data is a simple concept, but complicated to implement well. There are numerous business rules and regulatory concerns that need to be applied. There are also numerous technical considerations to be made, particularly if the producer and consumer of the data aren't using the same platforms. In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process.
Podcast episode
Data Sharing Across Business And Platform Boundaries: Sharing data is a simple concept, but complicated to implement well. There are numerous business rules and regulatory concerns that need to be applied. There are also numerous technical considerations to be made, particularly if the producer and consumer of the data aren't using the same platforms. In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process.
byData Engineering Podcast
0 ratings
0% found this document useful
Building An Internal Database As A Service Platform At Cloudflare: Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.
Podcast episode
Building An Internal Database As A Service Platform At Cloudflare: Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.
byData Engineering Podcast
0 ratings
0% found this document useful
Keeping ourselves honest when we work with observational healthcare data: The abundance of data in healthcare, and the valu…
Podcast episode
Keeping ourselves honest when we work with observational healthcare data: The abundance of data in healthcare, and the valu…
byLinear Digressions
0 ratings
0% found this document useful
Episode 12.1 Fact Sheet Friday
Podcast episode
Episode 12.1 Fact Sheet Friday
byPushing Pediatrics
0 ratings
0% found this document useful
What is Facilitated Communication? Session 199 with Jason Travers: If your social media consumption is anything like mine, you've likely seen some as of late that report on non-speaking students - generally students with Autism - who are graduating from college, giving valedictorian speeches, and so...
Podcast episode
What is Facilitated Communication? Session 199 with Jason Travers: If your social media consumption is anything like mine, you've likely seen some as of late that report on non-speaking students - generally students with Autism - who are graduating from college, giving valedictorian speeches, and so...
byThe Behavioral Observations Podcast with Matt Cicoria
0 ratings
0% found this document useful
278: Beliefs in the Firmware: In this week's episode, Steph and Chris discuss the popular testing themes and questions that emerged during the RSpec training course, reflecting on which testing "rules" still apply and when to break the rules. They also chat about the results of the 2020 State of JS survey and repurposing email validations to be helpful vs strict.
Podcast episode
278: Beliefs in the Firmware: In this week's episode, Steph and Chris discuss the popular testing themes and questions that emerged during the RSpec training course, reflecting on which testing "rules" still apply and when to break the rules. They also chat about the results of the 2020 State of JS survey and repurposing email validations to be helpful vs strict.
byThe Bike Shed
0 ratings
0% found this document useful
Justin Dux - Be The Match: Sign-up for The Technopath Way Weekly Newsletter here: technopath.ac-page.com/the-technopath-way-sign-up Be The Match How can I check if I’m still registered in the database if I signed up years ago? You can call this number to find out: 1 (800)...
Podcast episode
Justin Dux - Be The Match: Sign-up for The Technopath Way Weekly Newsletter here: technopath.ac-page.com/the-technopath-way-sign-up Be The Match How can I check if I’m still registered in the database if I signed up years ago? You can call this number to find out: 1 (800)...
byThe Technopath Way: Productivity through tech for nonprofits
0 ratings
0% found this document useful
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library: Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
Podcast episode
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library: Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
byData Engineering Podcast
0 ratings
0% found this document useful
Putting machine learning into a database: Most data scientists bounce back and forth regula…
Podcast episode
Putting machine learning into a database: Most data scientists bounce back and forth regula…
byLinear Digressions
0 ratings
0% found this document useful

Skip carousel

Opinion: Two Words To Help Ned Sharpless Revolutionize Clinical Trials: Data Standards
STAT
Article
Opinion: Two Words To Help Ned Sharpless Revolutionize Clinical Trials: Data Standards
May 13, 2019
4 min read
Fortune Telling…
Linux Format
Article
Fortune Telling…
Dec 14, 2021
Matt Yonkovit Head of Open Source Strategy and silly hats, Percona “Accurately predicting trends can get you ahead of the competition. Next year I predict sadly we’ll see more of the same. I predict we’ll continue to see more security breaches due to
1 min read
How Mature Is Your Organisation With Regards To Digital And Web Analytics?
NZ Marketing
Article
How Mature Is Your Organisation With Regards To Digital And Web Analytics?
Jun 9, 2021
1 min read
Saxo Bank And Thoughtworks: Enabling Data Democratization At A Global Investment Bank
Business Today
Article
Saxo Bank And Thoughtworks: Enabling Data Democratization At A Global Investment Bank
Jan 20, 2023
2 min read
MARIADB Optimise And Control Your Databases
Linux Format
Article
MARIADB Optimise And Control Your Databases
Jul 30, 2019
9 min read
CalicoPie Family Historian 7
Computeractive
Article
CalicoPie Family Historian 7
Mar 24, 2021
SOFTWARE | £60 from Family Historian Store www.snipca.com/37615 If you’ve ever researched your family tree, you’ll know it’s much harder than the BBC’s celebrity genealogy programme Who Do You Think You Are? makes it appear. You’ll certainly need to
2 min read
Powering Costing With Artificial Intelligence: The Case Of Vodafone Procurement
The European Business Review
Article
Powering Costing With Artificial Intelligence: The Case Of Vodafone Procurement
May 25, 2021
8 min read
Facilities Systems
Facility Management
Article
Facilities Systems
Oct 21, 2018
5 min read
Manipulate Data Like A Pro With Pandas
Linux Format
Article
Manipulate Data Like A Pro With Pandas
Jul 27, 2021
7 min read
Pandas And Data
Linux Format
Article
Pandas And Data
Jul 27, 2021
Pandas can perform a variety of tasks including data loading, preparation and manipulation as well as data modelling and analysis. You can join, merge and reshape data with the help of Pandas, using data from different sources. As mentioned elsewhere
1 min read
Putting Artificial Intelligence to Work
Rotman Management
Article
Putting Artificial Intelligence to Work
May 1, 2018
11 min read
Pivoting To First-party Data
NZ Marketing
Article
Pivoting To First-party Data
Jun 9, 2021
5 min read
Better Together: Behavioural Science + Data Science
Rotman Management
Article
Better Together: Behavioural Science + Data Science
May 1, 2020
IMAGINE THIS SCENARIO: You are designing a new customer experience to drive a shift in customer behaviour. You have reviewed the reports and dashboards describing current behaviour. You have asked customers how they felt and incorporated their feedba
5 min read
Mac 911
MacWorld
Article
Mac 911
Apr 20, 2021
7 min read
How And Where You Use Machine-learning
APC
Article
How And Where You Use Machine-learning
Oct 7, 2019
4 min read
5 QUESTIONS with: Diahan Southard -DNA Expert
Family Tree
Article
5 QUESTIONS with: Diahan Southard -DNA Expert
Nov 27, 2023
2 min read
“What You See Is A Mirage, As Tuck-up Picture That Doesn’t Describe What’s Happening To Your Packets”
PC Pro Magazine
Article
“What You See Is A Mirage, As Tuck-up Picture That Doesn’t Describe What’s Happening To Your Packets”
Oct 8, 2020
6 min read
Federated Learning Uses The Data Right On Our Devices
Futurity
Article
Federated Learning Uses The Data Right On Our Devices
Jul 21, 2022
2 min read
Web App Security
Linux Format
Article
Web App Security
Jun 29, 2021
8 min read
LIKE A PRO… Master Of Technology
Cycling Plus
Article
LIKE A PRO… Master Of Technology
Aug 5, 2020
2 min read
Q&A
Rotman Management
Article
Q&A
May 1, 2023
Describe the capability that companies like Netflix, UPS, Amazon and Caesars Entertainment have in common. These are all leading firms in their industries with respect to leveraging analytics as a source of competitive advantage. We now have so much
7 min read
Business NAS appliances 2022
PC Pro Magazine
Article
Business NAS appliances 2022
Apr 10, 2022
4 min read
Observe The Observers
Linux Format
Article
Observe The Observers
Sep 22, 2020
“Observability comes from control theory and means that you should be able to determine or infer a system’s state by its output. For some setups, this can involve using strace and grepping through logs, but the number of machines that DBA’s are deali
1 min read
“How Do You Launch A Product Without Alienating Or Damaging Your Customers?”
PC Pro Magazine
Article
“How Do You Launch A Product Without Alienating Or Damaging Your Customers?”
Feb 10, 2022
6 min read
Machine Learning – With Zero Programming
APC
Article
Machine Learning – With Zero Programming
Aug 12, 2019
6 min read
Google Answer Box Strategy
Techfastly
Article
Google Answer Box Strategy
Sep 21, 2020
Leveraging the Google PAA (People Also Ask) element on a Search Results Page for Targeted Content Creation with a Python Scraper All businesses that are online today are creating content at a furious pace. According to Technavio, a research firm, con
7 min read
Integrated Workplace Management Systems
Facility Management
Article
Integrated Workplace Management Systems
Dec 23, 2018
Property and facilities management are data-rich operating worlds. This is becoming even more complex as the Internet of Things (IoT) provides the capability to imbed sensors and diagnostic tools to monitor the use and performance of everything in re
4 min read
Data Entry
Racecar Engineering
Article
Data Entry
Jun 7, 2019
14 min read
End Of The Line!
Linux Format
Article
End Of The Line!
Nov 15, 2022
"October 2023 may seem like a long time away, but that’s when MySQL 5.7 will hit end-of-life (EOL) status. This normally means no more updates or security patches will be released. For companies running this database in their applications, it is time
1 min read
Your Digital Family Tree Helpdesk
Family Tree UK
Article
Your Digital Family Tree Helpdesk
Mar 10, 2020
4 min read

Related categories

Skip carousel

Reviews for Multiple Imputation of Missing Data Using SAS

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Multiple Imputation of Missing Data Using SAS - Patricia Berglund

About This Book

Purpose

Multiple Imputation of Missing Data Using SAS provides both theoretical background and constructive solutions for those working with incomplete data sets in an engaging example-driven format. It offers practical instruction on the use of SAS for multiple imputation and provides numerous examples using a variety of public release data sets.

Is This Book for You?

Written for users with an intermediate background in SAS programming and statistics, this book is an excellent resource for anyone seeking guidance on multiple imputation. The authors cover PROC MI and PROC MIANALYZE in detail along with other procedures used for analysis of complete data sets. They guide analysts through the multiple imputation process, including evaluation of missing data patterns, choice of an imputation method, execution of the process, and interpretation of results.

Prerequisites

An intermediate level of SAS programming and statistics is recommended.

Scope of This Book

The authors cover PROC MI and PROC MIANALYZE in detail along with other procedures used for analysis of complete data sets. They guide analysts through the multiple imputation process, including evaluation of missing data patterns, choice of an imputation method, execution of the process, and interpretation of results. Many applications using real-world data sets are presented.

About the Examples

Software Used to Develop the Book's Content

SAS v9.4 is used throughout the book. PROC MI and PROC MIANALYZE are highlighted along with use of a variety of standard and SAS SURVEY procedures.

Example Code and Data

Example code and data are available from the book website. You can access the example code and data for this book by linking to the Berglund author page at http://support.sas.com/publishing/authors.

Select the name of the author. Then, look for the cover thumbnail of this book, and select Example Code and Data to display the SAS programs that are included in this book.

For an alphabetical listing of all books for which example code and data is available, see http://support.sas.com/bookcode. Select a title to display the book's example code.

If you are unable to access the code through the website, e-mail saspress@sas.com.

Output and Graphics Used in This Book

All output and graphics were created using SAS v9.4.

Additional Resources

Applied Survey Data Analysis and website: http://www.isr.umich.edu/src/smp/asda/.

IVEware software download and documentation is available at: iveware.org.

Keep in Touch

We look forward to hearing from you. We invite questions, comments, and concerns. If you want to contact us about a specific book, please include the book title in your correspondence to saspress@sas.com.

To Contact the Author through SAS Press

By e-mail: saspress@sas.com

Via the Web: http://support.sas.com/author_feedback

SAS Books

For a complete list of books available through SAS, visit http://support.sas.com/bookstore.

Phone: 1-800-727-3228

Fax: 1-919-677-8166

E-mail: sasbook@sas.com

SAS Book Report

Receive up-to-date information about all new SAS publications via e-mail by subscribing to the SAS Book Report monthly eNewsletter. Visit http://support.sas.com/sbr.

Publish with SAS

SAS is recruiting authors! Are you interested in writing a book? Visit http://support.sas.com/saspress for more information.

About The Authors

Patricia Berglund is a Senior Research Associate in the Survey Methodology Program at the University of Michigan Institute for Social Research (ISR). She has extensive experience in the use of SAS for data management and analysis. She is a faculty member in the ISR's Summer Institute in Survey Research Techniques and also directs the ISR's SAS training programs. Berglund also teaches a SAS Business Knowledge Series class titled Imputation Techniques in SAS. Her primary research interests are mental health, youth substance issues, and survey methodology.

Steven Heeringa is a Senior Research Scientist at the University of Michigan Institute for Social Research (ISR) where he is Director of the Statistical Design Group. He is a member of the Faculty of the University of Michigan Program in Survey Methods and the Joint Program in Survey Methodology at the University of Maryland. Heeringa is a Fellow of the American Statistical Association and elected member of the International Statistical Institute. He is the author of many publications on statistical design and sampling methods for research in the fields of public health and the social sciences. Heeringa has over 35 years of statistical sampling experience in the development of the ISR's National Sample design, as well as research designs for the ISR's major longitudinal and cross-sectional survey programs.

Learn more about these authors by visiting their author pages, where you can download free book excerpts, access example code and data, read the latest reviews, get updates, and more: http://support.sas.com/berglund http://support.sas.com/heeringa

Acknowledgements

We gratefully acknowledge the anonymous technical reviewers along with editors and production staff at SAS Institute who made many excellent suggestions. Their help made this a much better book overall.

Special thanks go to our colleagues Dr. Trivellore Raghunathan and Dr. Brady West for their work and leadership in the area of multiple imputation.

Additional thanks to SAS employees Stacey Hamilton, Patsy Poole, Katie Whitley, Rob Agnelli, and Dr. Catherine Truxillo who each supported and encouraged our work on this book.

Chapter 1: Introduction to Missing Data and Methods for Analyzing Data with Missing Values

1.1 Introduction

1.2 Sources and Patterns of Item Missing Data

1.3 Item Missing Data Mechanisms

1.4 Review of Strategies to Address Item Missing Data

1.4.1 Complete Case Analysis

1.4.2 Complete Case Analysis with Weighting Adjustments

1.4.3 Full Information Maximum Likelihood

1.4.4 Expectation-Maximization Algorithm.

1.4.5 Single Imputation of Missing Values

1.4.6 Multiple Imputation

1.5 Outline of Book Chapters

1.6 Overview of Analysis Examples

1.1 Introduction

Over the past half-century, statistical analysts have employed a wide range of techniques to address the theoretical and practical question of what do I do about missing values? These techniques have ranged from a simple default of dropping observations with missing data values from analysis (the list-wise deletion default in SAS and most other major software systems) to highly sophisticated methods for modeling the missing data mechanism in order to derive imputed values or to conduct a complex maximum likelihood analysis. There is no single approach that is optimal for all missing data problems—either in theory or in practice. Fortunately for the data analyst, SAS and other major statistical analysis software packages now provide their users with robust procedures tailored to address differing problems of missing data. Multiple Imputation of Missing Data Using SAS is written to serve as a practical guide for those dealing with general missing data problems in fields such as the social, biological, and physical sciences; medical and public health research; education; business; and many other scientific and professional disciplines. Central to this book is the method of multiple imputation (MI) for item missing data. Supported by the SAS PROC MI and PROC MIANALYZE procedures, MI is based on a powerful set of methods for both filling in the missing values in the user data set and for performing robust statistical estimation and inference using the completed data sets.

The combination of basic theoretical background and extensive practical applications using SAS (v9.4) presented in this volume provides a solid foundation for understanding and resolving missing data problems using the multiple imputation method. The applications presented in Chapters 4 through 8 address a number of common missing data problems and imputation approaches using PROC MI and PROC MIANALYZE along with various descriptive and inferential tools for analysis of complete data sets. All examples stress the three-step process of multiple imputation: 1) selection of an appropriate data model for the imputation and application of an appropriate imputation method using PROC MI; 2) analysis of complete data sets using standard or SURVEY procedures; and 3) synthesis of analytic results for statistical inference using PROC MIANALYZE.

1.2 Sources and Patterns of Item Missing Data

Missing data takes many forms and can be attributed to many causes. For example, in data derived from surveys, item missing data occurs when a respondent elects not to answer certain questions, resulting in only a don't know or refused response. The failure to respond can be driven by the desire to elude answering questions that are sensitive in nature or perceived to be intrusive (e.g., illegal behavior, income-related, or medical history questions) or by a simple lack of knowledge required to answer the questions (e.g., expenditure on durable goods in the past 30 days). More generally in observational and experimental research, missing observations can arise due to missed clinical appointments, equipment failures, or other circumstances that disrupt the intended measurement. For example, a power outage that deactivates environmental data collection instrumentation for a period of time results in a missing data problem. In some research settings such as epidemiology (Schenker et al. 2010) and genetics (Bobb et al. 2011), missing data is actually better labeled nonobserved data and powerful new statistical tools including MI are used to impute the nonobserved data based on strong associations with variables that were observed in the same or a different study.

Missing data problems can be classified based on a notation and taxonomy employed by Little and Rubin (2002). In regard to the notation, the data of analytic interest are the true underlying values for an n x p matrix, Y, consisting of i=1,…,n rows (sample cases) and j=1,… ,p variables, Yi = {Yi1 ,…,Yip} for each case. This underlying set of true values for the variables of interest is then decomposed into two subvectors of variables, Y = {Yobs, Ymis}, where Yobs are the set of variable values that are observed and Ymis are the variable values that are missing and must be imputed (Heeringa, West, and Berglund 2010). The statistical properties of individual variables and their relationships in the underlying data are governed by a distributional model, f(y|θ). Our analytic interest lies in making inference about the parameters θ or functions of these parameters. Maximum likelihood (ML) methods are often used in analysis, although corresponding Bayesian methods of analysis and inference may also be employed. In addition to the distributional model for the underlying data, the missing data problem also includes a second statistical model, g(m|ψ), that governs the stochastic process which determines whether the true value of a Yij is missing (Mij=1) or observed (Mij=0). Corresponding to the Y matrix of true data values is a second array of identical dimension, M, of indicator values that flag whether the corresponding element of Y is missing or observed.

To help us better understand our missing data problem and to choose an appropriate imputation strategy, statisticians have created a two-part taxonomy for the most common missing data problems. The first dimension is labeled the missing data pattern, which as the name implies defines the particular distribution of the missing observations (represented by the M matrix) across the data cases (i=1,…,n) and variables (j=1,…,p) that comprise our analytical data set.

The most common missing data pattern is termed generalized or arbitrary—where there is no particular pattern in the missing data structure. As illustrated by the cells with ? in the schematic array in Figure 1.1, missing observations are distributed across cases and variables in a nonsystematic fashion. The arbitrary missing data pattern illustrated in Figure 1.1 might be handled with an imputation using a Markov chain Monte Carlo (MCMC) or a fully conditional specification (FCS) method, both available in PROC MI.

Figure 1.1: Generalized Pattern of Missing Data

Some data collection processes produce a more structured or systematic pattern of missingness in the data. For example, in a medical study with multiple phases, missing data can occur when an entire phase of the data collection effort, such as a blood draw or obtaining medical records needed for follow-up, requires special consent of the subject. Without agreement from the study subject for participation in this type of data collection, there is missing data for all variables from the entire phase of the study. The result is a monotonic missing data pattern similar to that illustrated in Figure 1.2. Nonresponse to follow-up waves in a longitudinal survey may also produce a monotonic pattern of item missing data. Monotone patterns of missing data lend themselves to imputation methods that require simpler assumptions than approaches that are required for a general pattern of missing data and are efficiently handled in PROC MI. In fact, in Chapter 6 we will consider a two-step procedure that imputes small amounts of missing data in selected variables in order to transform a problem with a generalized missing data pattern to one that has a monotone pattern for key variables that suffer the highest rates of missing data. We note here that analysts always have the option to address monotone missing data patterns using procedures such as the FCS method that are applicable to the more general case of an arbitrary missing data pattern.

Figure 1.2: Monotone Missing Data Pattern

A third pattern of missing data arises in studies that incorporate randomization procedures to allow item missing data on selected variables for subsets of study observations. The technique is termed matrix sampling or missing by design sampling (Thomas et al. 2006) and is often employed with modularized sets of data observations (e.g., physical tests or sets of survey questions). In a survey data collection that employs matrix sampling, designated subsets of core questions are asked of all respondents, with additional modules of more in-depth questions randomly assigned to subsamples of participants. Matrix sampling designs tend to produce a non-monotonic missing data structure such as that illustrated in Figure 1.3.

Figure 1.3 Matrix Sampling (Missing by Design)

For this type of missing data pattern, multiple imputation has typically been the primary tool to analyze these data (Raghunathan and Grizzle 1995). Use of PROC MI with an imputation method such as FCS that is suitable for an arbitrary missing data pattern can be employed. As with any imputation problem, the recommended imputation method depends on the pattern of missing data and the type of variables to be imputed.

1.3 Item Missing Data Mechanisms

The second dimension of statisticians’ two-part taxonomy of missing data problems is termed the missing data mechanism. Missing data for a single variable is classified into one of three categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Missing data are MCAR if the probability that an item value is missing is completely random and does not depend on the missing values for a case, Ymis, nor does it depend on any of the observed variables for the case, Yobs. A more realistic assumption for item missing data that underlies the procedures covered in this book is that the data are missing at random (MAR). The MAR assumption requires that, conditional on the observed data for the case, Yobs, the probability that a value is missing does not depend on the true values of the missing items, Ymis. For example, the predictive distribution used to draw imputed values for Ymis may be a regression model in which the predictors are selected from Yobs. In practice, the MAR assumption may not strictly apply for all missing items. If the probability that a variable value is missing depends on the missing value and cannot be fully explained by the remaining observed variables, Yobs, the missing data mechanism is labeled missing not at random (MNAR). For example, if after accounting for observed age, gender, education, and marital status of the household head, the probability of item missing data on a measure of household income depends on the underlying dollar value, then the problem is MNAR.

Little and Rubin (2002) classify a missing data mechanism as ignorable for likelihood-based inference if two conditions hold: 1) the missing data are MAR (missingness does not depend on Ymis); and 2) the parameters of the data distribution, f(y|θ), are distinct from the parameters of the model for the missing data mechanism g(M|ψ). (For Bayesian inference, the second condition requires that the prior distributions for θ and ψ are independent.) From our perspective as analysts of data with missing values, the first of these two conditions is the more important. Likelihood inference remains valid, albeit statistically less efficient, if the parameter spaces of the data distribution and the missing data generating model are not distinct.

A commonly asked question is, Can the MAR assumption be tested in SAS? The current version of SAS/STAT software, SAS/STAT 13.1, implements sensitivity analysis for departures from the MAR assumption with the new MNAR statement in PROC MI. See the PROC MI documentation for details. In addition, Little (1988) presents a likelihood ratio test of the null hypothesis that the data are MCAR versus the alternative that they are MAR (conditional on a defined set of observed covariates). Although useful in some special cases, this test is not sufficient to establish that the missing data mechanism is ignorable. The econometric literature on selection bias (e.g., Amemiya, 1985; Heckman, 1976) presents several tests of the null hypothesis that the data are MAR versus the MNAR alternative. As Little (1985) notes, however, these

Enjoying the preview?

Page 1 of 1

Multiple Imputation of Missing Data Using SAS

About this ebook

Patricia Berglund

Related authors

Related to Multiple Imputation of Missing Data Using SAS

Related ebooks

Applications & Software For You

Related podcast episodes

Related articles

Related categories

Reviews for Multiple Imputation of Missing Data Using SAS

What did you think?

Book preview

Multiple Imputation of Missing Data Using SAS - Patricia Berglund

About This Book

Purpose

Is This Book for You?

Prerequisites

Scope of This Book

About the Examples

Software Used to Develop the Book's Content

Example Code and Data

Output and Graphics Used in This Book

Additional Resources

Keep in Touch

To Contact the Author through SAS Press

SAS Books

SAS Book Report

Publish with SAS

About The Authors

Acknowledgements

1.1 Introduction

1.2 Sources and Patterns of Item Missing Data

1.3 Item Missing Data Mechanisms