Biostatistics Using JMP: A Practical Guide
By Trevor Bihl
()
About this ebook
Trevor Bihl's Biostatistics Using JMP: A Practical Guide provides a practical introduction on using JMP, the interactive statistical discovery software, to solve biostatistical problems. Providing extensive breadth, from summary statistics to neural networks, this essential volume offers a comprehensive, step-by-step guide to using JMP to handle your data.
The first biostatistical book to focus on software, Biostatistics Using JMP discusses such topics as data visualization, data wrangling, data cleaning, histograms, box plots, Pareto plots, scatter plots, hypothesis tests, confidence intervals, analysis of variance, regression, curve fitting, clustering, classification, discriminant analysis, neural networks, decision trees, logistic regression, survival analysis, control charts, and metaanalysis.
Written for university students, professors, those who perform biological/biomedical experiments, laboratory managers, and research scientists, Biostatistics Using JMP provides a practical approach to using JMP to solve your biostatistical problems.
Trevor Bihl
Trevor Bihl is both a research scientist/engineer and an educator who teaches biostatistics, engineering statistics, and programming courses. He has been a SAS/JMP user since 2009 and provides various biostatistics and data mining consulting services. His background includes multivariate statistics, signal processing, data mining, and analytics. His educational background includes a BS and MS from Ohio University and a PhD from the Air Force Institute of Technology. He is the author of multiple journal and conference papers, book chapters, and technical reports.
Related to Biostatistics Using JMP
Related ebooks
Real World Health Care Data Analysis: Causal Methods and Implementation Using SAS Rating: 0 out of 5 stars0 ratingsBiostatistics by Example Using SAS Studio Rating: 0 out of 5 stars0 ratingsCarpenter's Guide to Innovative SAS Techniques Rating: 0 out of 5 stars0 ratingsApplied Data Mining for Forecasting Using SAS Rating: 0 out of 5 stars0 ratingsSAS Graphics for Clinical Trials by Example Rating: 5 out of 5 stars5/5Advanced SQL with SAS Rating: 0 out of 5 stars0 ratingsSAS Statistics by Example Rating: 5 out of 5 stars5/5An Introduction to Creating Standardized Clinical Trial Data with SAS Rating: 0 out of 5 stars0 ratingsJump into JMP Scripting, Second Edition Rating: 0 out of 5 stars0 ratingsInteractive Reports in SAS® Visual Analytics: Advanced Features and Customization Rating: 0 out of 5 stars0 ratingsClinical Graphs Using SAS Rating: 0 out of 5 stars0 ratingsGeneralized Linear and Nonlinear Models for Correlated Data: Theory and Applications Using SAS Rating: 0 out of 5 stars0 ratingsSAS Viya: The R Perspective Rating: 0 out of 5 stars0 ratingsSampling Rating: 5 out of 5 stars5/5JMP for Mixed Models Rating: 0 out of 5 stars0 ratingsA Course in Statistics with R Rating: 0 out of 5 stars0 ratingsSAS for Forecasting Time Series, Third Edition Rating: 0 out of 5 stars0 ratingsElementary Statistics Using SAS Rating: 0 out of 5 stars0 ratingsR and Data Mining: Examples and Case Studies Rating: 3 out of 5 stars3/5Categorical Data Analysis Using SAS, Third Edition Rating: 0 out of 5 stars0 ratingsLearning Predictive Analytics with R Rating: 0 out of 5 stars0 ratingsThe Little SAS Book: A Primer, Sixth Edition Rating: 5 out of 5 stars5/5Machine Learning with R, the tidyverse, and mlr Rating: 0 out of 5 stars0 ratingsIntroduction to Biostatistics with JMP (Hardcover edition) Rating: 1 out of 5 stars1/5Biostatistics Decoded Rating: 0 out of 5 stars0 ratingsWise Use of Null Hypothesis Tests: A Practitioner's Handbook Rating: 0 out of 5 stars0 ratingsApplications of Regression Models in Epidemiology Rating: 0 out of 5 stars0 ratingsModern Experimental Design Rating: 0 out of 5 stars0 ratingsJMP for Basic Univariate and Multivariate Statistics: Methods for Researchers and Social Scientists, Second Edition Rating: 0 out of 5 stars0 ratingsPatient Registry Data for Research: A Basic Practical Guide Rating: 0 out of 5 stars0 ratings
Enterprise Applications For You
Scrivener For Dummies Rating: 4 out of 5 stars4/5Learn Windows PowerShell in a Month of Lunches Rating: 0 out of 5 stars0 ratingsCreating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates Rating: 4 out of 5 stars4/5Bitcoin For Dummies Rating: 4 out of 5 stars4/5Excel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1 Rating: 5 out of 5 stars5/5ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology Rating: 0 out of 5 stars0 ratingsSharePoint 2016 For Dummies Rating: 5 out of 5 stars5/5Systems Thinking: Managing Chaos and Complexity: A Platform for Designing Business Architecture Rating: 4 out of 5 stars4/5Excel Formulas and Functions 2020: Excel Academy, #1 Rating: 4 out of 5 stars4/5QuickBooks 2023 All-in-One For Dummies Rating: 0 out of 5 stars0 ratingsExcel 2019 For Dummies Rating: 3 out of 5 stars3/550 Useful Excel Functions: Excel Essentials, #3 Rating: 5 out of 5 stars5/5Notion for Beginners: Notion for Work, Play, and Productivity Rating: 4 out of 5 stars4/5Enterprise AI For Dummies Rating: 3 out of 5 stars3/5Learning Python Rating: 5 out of 5 stars5/5Using Word 2019: The Step-by-step Guide to Using Microsoft Word 2019 Rating: 0 out of 5 stars0 ratingsQuickBooks 2021 For Dummies Rating: 0 out of 5 stars0 ratingsManaging Humans: Biting and Humorous Tales of a Software Engineering Manager Rating: 4 out of 5 stars4/5Excel Tips and Tricks Rating: 0 out of 5 stars0 ratingsMastering QuickBooks 2020: The ultimate guide to bookkeeping and QuickBooks Online Rating: 0 out of 5 stars0 ratingsExcel 2016 For Dummies Rating: 4 out of 5 stars4/5The New Email Revolution: Save Time, Make Money, and Write Emails People Actually Want to Read! Rating: 5 out of 5 stars5/5Essential Office 365 Third Edition: The Illustrated Guide to Using Microsoft Office Rating: 3 out of 5 stars3/5Experts' Guide to OneNote Rating: 5 out of 5 stars5/5The Basics of Hacking and Penetration Testing: Ethical Hacking and Penetration Testing Made Easy Rating: 4 out of 5 stars4/5
Reviews for Biostatistics Using JMP
0 ratings0 reviews
Book preview
Biostatistics Using JMP - Trevor Bihl
Chapter 1: Introduction
1.1 Background and Overview
1.2 Getting Started with JMP
1.3 General Outline
1.4 How to Use This Book
1.5 Reference
1.1 Background and Overview
This book evolved from personal experiences in both teaching and consulting in biostatistics. Although many biostatistics textbooks show computer outputs and results, they rarely show how to generate the results. Biostatistics instruction is also commonly theoretical and based on solving simple problems by hand. However, real-world data is usually more complicated than the simple examples, and I found that frequent collaborators–PhD-educated researchers who performed and managed experiments–needed more understanding of software to analyze their data. This is difficult because such researchers often spend a majority of their time performing experiments and use statistical software sparingly. They also often do not have access to a dedicated biostatistician in their office or are competing for the time of their office’s single biostatistician. Although such researchers might know the mechanics of a statistical method, they might not how to generate meaningful results using software.
Therefore, a practical, how-to guide to biostatistics was needed. There are many software applications available for statistics and biostatistics, so why JMP? As an educator, I found JMP an advantage to teaching. I could spend more time on theory and interpretation because JMP does not require scripts and syntax. As a collaborator and consultant, I found my colleagues would readily gravitate toward JMP and its results because of the graphical user interface (GUI) format and its ease of use. And finally, unless you want to code algorithms themselves, as a researcher, you will find JMP to be more user-friendly, correct, and developed when compared to many other competing packages. Incidentally, if you want to code, SAS programming abilities do exist in JMP. Thus, you can fully use JMP for analysis ranging from simple to complex and customized.
This book presents and solves problems germane to biostatistics with easy-to-reproduce examples. The book is also a general biostatistics reference that leverages the topics found in leading biostatistics books. This chapter introduces JMP, presents a general outline of the book contents, and provides a brief guide to using this book.
1.2 Getting Started with JMP
When you first run JMP, you will be greeted with a Tip of the Day (Figure 1.1). There are 62 tips of the day, and they show up whenever you start JMP. These tips can be useful to new JMP users in gaining familiarity with the software. However, if you don’t want to see these tips further, you can do the following:
1. Clear Show tips at start-up.
2. Click Close.
Figure 1.1 Initial Tip of the Day
After you close the Tip of the Day, you are greeted with the primary JMP interface seen in Figure 1.2. Here, you can load data, create a new data table, or look for recently used files. If this is the first time you have opened JMP, there will be no recent files to consider. Thus, you must load or create a new data table.
To load a file:
● Click File ► Open.
or
● Click on the third icon on the taskbar.
To create a blank data table:
● Click File ► New.
or
● Click on the first icon on the taskbar.
Alternatively, if you want to load a built in JMP example data file, you can do so. A variety of files are available.
To load example data files:
1. Click Help ► Sample Data.
2. Select a data file under the method of interest.
Also, you can select individual or multiple data tables in the Window List and then close all of these files. This is advantageous if you inadvertently opened many files, such as in a mistakenly setup Internet open.
To close many open data tables:
1. Select the windows of interest.
2. Right-click and select Close.
3. You will then be prompted to save these files.
Figure 1.2 JMP Primary Interface
If you create a new data table, you will be presented with Figure 1.3. Here, you see that there is a spreadsheet-like table, with a Column 1 ready for you to start considering. Also, when you have loaded and analyzed data, you can save these results to the JMP data table and instantly reload at a later date, as will be discussed in Section 14.2.
Figure 1.3 New Data Table
1.3 General Outline
With this basic usability knowledge from Section 1.2, you are now ready to consider bio-statistical data analysis. Biostatistics covers a wide variety of topics ranging from simple hypothesis tests to complex nonlinear algorithms. This book aims to cover the range of methods with varying levels of detail. To do so, this book is organized sequentially as outlined in Table 1.1.
Table 1.1 General Outline of Biostatistics Using JMP: A Practical Guide
1.4 How to Use This Book
In Chapters 2 and 3, this book moves to data-wrangling issues, such as data collection and cleaning. Since upward of 80% of your time can be spent in making messy data usable (Lohr, 2014), learning the tools JMP has for assembling and cleaning data is key and is covered in Chapters 2 and 3.
In Chapters 4 and 5, you can learn the basics of descriptive statistics and data visualizations in JMP. Following this, the primary focus is on modeling, which involves creating a mathematical representation (a model) of data or a system in order to make inferences about it.
After this, you have a few different paths available:
Chapter 6 discusses epidemiological and geographical interpretations.
Chapter 4 also discusses developing custom equations.
Chapter 7 discusses the various hypothesis test and confidence interval methods.
Chapters 8 to 10 discuss linear models such as analysis of variance (ANOVA), regression, and model validation.
Because of the interrelation of the underlying methods of regression (Chapter 9) and ANOVA (Chapter 8), diagnostic and remedial measures for these methods are discussed in Chapter 10.
Chapter 11 discusses classification methods, such as logistic regression, and clustering methods, such as k-means.
Chapters 7 to 10 largely deal with a continuous dependent (e.g., Y, variable (prediction)). Methods to analyze a discrete dependent variable are presented in Chapter 11.
Chapter 12 presents advanced modeling methods (e.g., factor analysis, neural networks, and control charts).
Chapter 13 introduces the vast array of survival analysis methods in JMP.
Chapter 14 presents methods that facilitate collaborating in addition to sources of additional functionality.
If you are using a previously created data set, then it is advantageous to start with data-wrangling methods and then look at the various analytical tools this book discusses. However, if you are starting a new experiment and will be collecting data, then you should start looking at Section 8.4.1, which discusses experimental design considerations and how to develop and select factor levels for an experiment.
1.5 Reference
Lohr, S. (2014, Aug. 18). For big-data scientists, ‘janitor work’ is key hurdle to insights. New York Times, p. B4.
Chapter 2: Data Wrangling: Data Collection
2.1 Introduction
2.2 Collecting Data from Files
2.2.1 JMP Native Files
2.2.2 SAS Format Files
2.2.3 Excel Spreadsheets
2.2.4 Text and CSV Format
2.3 Extracting Data from Internet Locations
2.3.1 Opening as Data
2.3.2 Opening as a Webpage
2.4 Data Modeling Types
2.4.1 Incorporating Expression and Contextual Data
2.5 References
2.1 Introduction
Many examples in textbooks and manuals such as this book are very orderly and clean. However, real-world data is rarely orderly and clean (authors actually spend a lot of time searching for and fabricating good example data) and up to 80% of your time can be lost in wrangling
to make messy data usable (Lohr, 2016). Medical and biological data is also becoming increasingly big
in nature (Bihl, Young II, and Weckman, 2016), and thus wrangling the data can be of interest (Marx, 2013). To analyze messy real-world data, data wrangling comes into play.
Data wrangling, conceptualized in Figure 2.1, involves taking raw data, extracting and cleaning it, and developing data features for analysis. The boundaries are not always clear when data wrangling ends and when statistical analysis begins. Thus, some overlap exists between data wrangling and statistical analysis when you begin to select/extract/analyze data features (e.g., the columns of a JMP data table).
Figure 2.1 Data-Wrangling Overview, adapted from (Boehmke, 2016)
Data wrangling is not frequently discussed in conjunction with analysis because books, including this one, focus on developing knowledge in using the presented methods. However, it is important to mention data-wrangling issues since the real world is not full of clean, textbook-style data. Thus, this book considers two data-wrangling tasks with Chapter 2 focusing on data collection in JMP and Chapter 3 focusing on data cleaning using JMP tools. Discussion of data analysis methods then makes up the majority Chapters 4 to 13 of this book. This chapter will present data collection methods built into JMP. JMP can load both JMP and SAS format files, in addition to loading Excel and CSV files. Moreover, data can be imported from text files and Internet webpages using the sophisticated data importation tools in JMP.
2.2 Collecting Data from Files
JMP supports opening a wide variety of data file formats, as detailed in Table 2.1 with annotation on, provided that JMP can load or save in that format. By supporting a wide variety of data types, JMP facilitates work on older data sets, with users who do not use JMP, and with a wide audience. To aid in usability, this book will consider examples from some of these data types (*.txt, *.sas7bdat, *.csv).
Table 2.1 Data Files Supported by JMP
2.2.1 JMP Native Files
JMP uses a propriety data format native to JMP software. As with any file format, this format describes how to handle the pieces of the file. However, not all software applications have the keys to unlock all data types. Thus, JMP knows how to assemble a data table from a *.jmp file, but other software applications likely do not. However, various advantages exist to the JMP native format. If you perform an analysis in JMP, you can save this analysis in the JMP file and reload it on a different computer exactly as you left it. Although you could save to a *.csv (or other formats, as shown in Table 2.1), the ability to reload an analysis would be lost by saving outside the *.jmp format.
2.2.1.1 JMP Sample Data
A wide variety of sample data sets are available in JMP to illustrate the application of specific methods and to provide data for analysis and demonstration. In JMP 13, 543 sample data files are included and available for use. Their applications range from simple to complex, and this enables you to become familiar with both JMP data and methods through exploring them.
To access the folder containing all sample data available in your copy of JMP:
1. Select Help ► Sample Data Library.
2. In the Explorer window that opens to the data directory, double-click on the file of interest to load it.
To access the sample data listed by method or domain of interest:
1. Select Help ► Sample Data.
2. Select the method or domain of interest (e.g., click Analysis of Variance).
Note that you can further see the type of method these files best correspond with.
3. Select an example file of interest (e.g., Blood Pressure).
4. A data table for this file is then opened.
2.2.2 SAS Format Files
SAS files are natively supported by JMP. However, a wide variety of SAS file extensions are in use (e.g., *.sas, *.ss2, *.sc2, etc.). Thus, some care and control might be of interest when loading a file from SAS into JMP. For an example, you can load a SAS data file to JMP:
1. Select File ► Open.
2. Find the SAS file of interest and click once on it.
3. After you select a SAS file, JMP enables you to specify how to open the file. (See Figure 2.2.)
4. Click Open when you are satisfied with your selection.
5. You will then be greeted with a JMP data table.
As hinted at in Figure 2.2, JMP knows what to do with a SAS file and thus JMP is asking you whether you want your JMP data table to have column names from the SAS variables names or the SAS variable labels. After making this selection and opening the file, you will be greeted with a JMP data table.
Figure 2.2 Opening a SAS File
2.2.3 Excel Spreadsheets
Microsoft Excel is a spreadsheet software with worldwide use, and thus JMP readily handles the importation of data from Excel spreadsheets. Since Excel can contain data in multiple sheets of the same file, some care and user interaction might be necessary to import data from Excel. An example will be considered using a generic data file ExcelExample.xlsx, which contains nothing more than two columns of user-generated numbers.
For an example, you can load an Excel data file to JMP:
1. Select File ► Open.
2. Find the Excel file of interest and click once on it.
3. After you select an Excel file, JMP enables you to specify how to open the file. (See Figure 2.3.)
a. The key consideration at this point is whether Row 1 values are column/variable names or are numbers.
4. Click Open when you are satisfied with your selection.
5. You will then be greeted with the Excel Import Wizard. (See Figure 2.4.)
a. At the top left, you can view the Excel data.
b. At the top right, select the Excel sheet of interest. (This example has only one sheet with numbers.)
c. Specify needed considerations using the options in the middle (e.g., where the data is located).
6. Click Import when you are satisfied that the view in the top left is what you want the data table to look like.
Figure 2.3 Opening an Excel File
Figure 2.4 Excel Import Wizard Dialog Box
2.2.4 Text and CSV Format
JMP can load text files as well. However, some care must be taken in this process since text files can be highly unstructured. To facilitate this reality, JMP incorporates a variety of options when opening a text file. For this example, a small file, TextExample.txt, was created with two tab-separated columns of data. Loading a *.csv file will be similar.
To load a text file:
1. Select File ► Open.
2. Find the text file of interest and click once on it.
3. After a text file is selected, JMP enables you to specify how to open the file. (See Figure 2.5.)
A variety of options exist, as shown in Figure 2.5. Selecting Data, using Text Import preferences lets JMP use whatever preferences have been selected. Selecting Data, using best guess lets JMP examine the data and use what it believes to be the best approach to open the data. Selecting Plain text into Script window will not load the file as a data table, but as a script text file. Here you can select Data with Preview, which will allow you more control of the text loading and show additional functions that can assist in loading text files.
Figure 2.5 Opening a Text File
After you select Data with Preview and click OK, you are greeted with the Import dialog box, as shown in Figure 2.6. Here, you see the text file that you have loaded. For this text file, you see a blue line separating the Day and Meas columns, indicating which parts of the text file JMP will assign to data table columns. The text file under consideration has columns separated by tabs. JMP was able to detect this and its best guess is that the file is tab delimited. However, you could select a wide variety of delimited or fixed-width fields. In addition, you could have selected a subset of this file or a compatibility standard, as shown in the expanded sections at the bottom of Figure 2.6.
To continue loading:
1. Ensure that the lines for data columns at the top of the Import dialog box are as desired.
2. Click Next.
3. Change the names of the columns, if needed.
4. Click Import.
We are now greeted with the resultant data table as shown in Figure 2.7.
Figure 2.6 The Text File Import Dialog Box with All Fields Displayed
Figure 2.7 Data Table from Text Imported File
2.3 Extracting Data from Internet Locations
Data is not always conveniently in a spreadsheet or file. Frequently, data can be found on webpages, and thus you need to bring the data to the software. Various options exist to accomplish this task, including copying and pasting the data into a spreadsheet. However, this can be inefficient. To improve the process, you can directly load a webpage and extract its data using JMP. To collect data from the Internet in JMP, begin by selecting File ► Internet Open.
The dialog box shown in Figure 2.8 will appear. Type or paste the URL into the field. For this example, consider the following URL:
https://en.wikipedia.org/wiki/Reported_Road_Casualties_Great_Britain
This is a Wikipedia article that has data from road casualties in Great Britain from 1926 to 2015.
Figure 2.8 Internet Open
After you type the URL into the field, you must specify how this will be opened via the Open As drop-down menu. JMP supports three options, listed in Table 2.2. For this example, examine opening as data or text.
Table 2.2 Internet Open Data Options
2.3.1 Opening as Data
If you select the default option, JMP will open the webpage as data. An dialog box like that in Figure 2.9 will appear. This dialog box shows the various items JMP sees as tables:
1. Select the desired table or tables.
2. Click OK.
This example shows only the first available table selected. The resultant data tables will appear, as shown in Figure 2.10.
Figure 2.9 Importing a Table When Opening as Data
Figure 2.10 Imported Data Table
2.3.2 Opening as a Webpage
After JMP opens the URL as a webpage, you can use the page, Figure 2.11, as you could any browser. If you investigate this webpage in the JMP browser, you will find that there are two tables of possible interest on this webpage.
Figure 2.11 Wikipedia Page for Reported Road Casualties in the JMP Browser
After you become familiar with the webpage, perform the following steps to create a data table:
1. Select File ► Import Table as Data Table.
2. An dialog box like that in Figure 12.2 will appear.
3. Select the desired table or tables.
4. Click OK.
If you select both options, JMP will instantiate two data tables: the table seen in Figure 2.10, and the table seen in Figure 2.12, which has additional information. However, you should not begin to investigate the data in Figure 2.12 yet. First, you should review the URL; if you examine the webpage as shown in Figure 2.11, you will find that the data in Figure 2.12 is for the year 2008 only. Second, the Ref. and Note columns of both Figure 2.10 and Figure 2.12 could be either very troublesome or very useful, depending on what you plan to do next.
Figure 2.12 Types of Casualties
2.4 Data Modeling Types
JMP data tables can contain columns with continuous values, categorical (nominal or ordinal) values, and additional types (such as pictures). How these columns are modeled within JMP relates to the data type and modeling type. A variety of available data modeling types and their associated symbols are presented in Figure 2.13 with Figure 2.13a showing modeling types for numeric data and Figure 2.13b showing additional modeling types. Basic data types include numeric, character (for text), row state (as for a column that shows the states of individual rows), and expression (such as pictures).
Figure 2.13 Data Modeling Types in JMP13
The numeric data type should be used for continuous random variables, ordinal and nominal should be used for discrete values (both numeric and text), and none should be used for columns that fit none of these categories. Characters are frequently seen in categorical groups (e.g., a nominal column with gender) and with unstructured text (e.g., raw observational data from a lab technician). Expressions can be useful for contextual data (e.g., you want to include a picture from a stained slide to link the graphical data to the numeric results that you will analyze). Although the focus of this book will be on numeric data and nominal character data, an example will be seen in Section 2.4.1