
PRESENTING AND SUMMARISING DATA

Meaning and Uses of Data

According to Webster's New World Dictionary, data means something that is known or assumed. Data can therefore give a picture of a situation or problem. For example, the price of medium-quality rice at Pasar Senen, Jakarta, on 4 January 1999 was Rp 450 per kg. Stating the place and time is very important because the data (the price of rice per kg) changes over time and also differs from place to place.

In the context of management problems, data can serve as a basis for planning, a means of control, and a basis for evaluation.

The Need for Statistics
1. Describing and understanding relationships
2. Making better decisions
3. Dealing with change

Statistical Problem-Solving Methodology
1. Identify the problem or opportunity
2. Gather the available facts
3. Gather new original data
4. Classify and summarise the data
5. Present the data
6. Analyse the data

Requirements for Good Data and Classification of Data

Good data are objective, representative, subject to only a small sampling error, timely, and relevant.

CHART: DATA COLLECTION AND ORGANISATION (flowchart)
1. Start.
2. Collect the raw data.
3. Organise the raw data, for example by placing it in an array, if necessary.
4. Should the data be summarised and simplified?
   Yes: prepare a frequency distribution by grouping the ordered data into classes, prepare a graphical presentation of the frequency distribution, then compute measures that summarise the characteristics of the grouped data (compute the central value, and compute the dispersion and its tendency).
   No: compute measures that summarise the characteristics of the data set (compute the central tendency and the dispersion).
5. Analyse the characteristic under study.
6. End.
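The grouping step in the chart above (ordering the raw data and counting how many values fall into each class) can be sketched as follows. This is only an illustration: the raw values, the class width of 10, the starting boundary of 40, and the function name `frequency_distribution` are all assumptions made for the example, not part of the original material.

```python
from collections import Counter


def frequency_distribution(values, class_width, start):
    """Count how many values fall in each class [lower, lower + class_width)."""
    counts = Counter()
    for v in values:
        index = int((v - start) // class_width)  # which class the value belongs to
        lower = start + index * class_width      # lower boundary of that class
        counts[lower] += 1
    return dict(sorted(counts.items()))


raw = [52, 47, 68, 55, 61, 49, 73, 58, 64, 50]  # hypothetical raw data
ordered = sorted(raw)                            # the "array" step: order the data
table = frequency_distribution(ordered, class_width=10, start=40)

for lower, count in table.items():
    # The upper label assumes integer data, so 40-49 means 40 <= x < 50.
    print(f"{lower}-{lower + 9}: {count}")
```

The choice of class width and starting boundary is a judgment call; different choices give different tables for the same data.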

Data can be classified in several ways, including:
1. By nature. Data are divided into qualitative data and quantitative data.
2. By source. This refers to where the data were obtained, namely external or internal sources.
3. By method of acquisition. Data are divided into primary data and secondary data.
4. By time of collection. Data are divided into cross-section data and time series data.

Definition of Statistics

In the narrow sense, statistics means summary data in numerical (quantitative) form. Population statistics, for example, are data or information in the form of summary figures about a population (its size, average age, distribution, the percentage of people who are illiterate, and so on). In the broad sense, statistics is the science that studies how to collect, process or group, present, and analyse data, and how to draw conclusions while accounting for uncertainty on the basis of probability concepts.

A more theoretical definition is taken from the book Statistical Theory in Research by Anderson and Bancroft: "Statistics is the science and art of the development and application of the most effective methods of collecting, tabulating, and interpreting quantitative data in such a manner that fallibility of conclusions and estimates may be assessed by means of inductive reasoning based on the mathematics of probability."

Methods of Data Collection

Census
A census is a method of data collection in which every element of the population is examined one by one.

Sampling
Sampling is a method of data collection in which only the elements of a sample drawn from the population are examined.
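The difference between the two collection methods can be illustrated with a small sketch. Everything here is assumed for the example: the population is a hypothetical list of 1,000 simulated values, and the sample of 50 is drawn with Python's `random.sample`, which gives every element the same chance of being selected.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 measurements (not real data).
population = [random.gauss(100, 15) for _ in range(1000)]

# Census: every element of the population is examined.
census_mean = statistics.mean(population)

# Sampling: only the elements of a sample are examined.
sample = random.sample(population, k=50)
sample_mean = statistics.mean(sample)

print(f"census mean: {census_mean:.2f}")
print(f"sample mean: {sample_mean:.2f}")
```

The sample mean will generally be close to, but not equal to, the census mean; that gap is the sampling error.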

Sampling Methods
1. Random sampling is a way of selecting elements from the population as members of the sample such that every element of the population has an equal chance of being selected. This method is considered objective because it is neutral, and it is called probability sampling: every element of the population has the same probability of being selected.
2. Nonrandom sampling is a way of selecting elements from the population as members of the sample in which the elements do not all have the same chance of being selected. This method is considered subjective, and it is called nonprobability sampling: the elements do not all have the same probability of being selected.

Measuring Location

Mean and Weighted Average
The mean (also known as the average) is obtained by dividing the sum of the observed values by the number of observations, n. Although individual data points fall above, below, or on the mean, it is often used as an estimate when predicting subsequent data points. The formula for the mean is given below as equation (1). The Excel syntax for the mean is AVERAGE(starting cell:ending cell).

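$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad (1)$$

where $\sum_{i=1}^{n} X_i$ is the mathematical notation for the sum of all values (X1, X2, ..., Xn). In other words:

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$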

Median
The median is the middle value of an ordered set of data containing an odd number of values, or the average of the two middle values of an ordered set of data with an even number of values. The median is especially helpful when separating data into two equal-sized bins. The Excel syntax to find the median is MEDIAN.

Mode
The mode of a set of data is the value that occurs most frequently. The Excel syntax for the mode is MODE.
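As a small worked example of the three measures of location, the sketch below uses Python's standard `statistics` module in place of the Excel functions named above. The six data values are made up for the illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical observations

mean = statistics.mean(data)      # sum of values / number of values = 30 / 6
median = statistics.median(data)  # even count, so average of the two middle values: (3 + 5) / 2
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

Here the mean (5) is pulled above the median (4) by the single large value 10, which is the usual reason for reporting both.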

Measuring Variability

Range
As with location, there are a number of different measures of variability. The simplest of these is probably the range, which is the difference between the largest and smallest observation in the dataset. The disadvantage of this measure is that it is based on only two of the observations and may not be representative of the whole dataset, particularly if there are outliers. In addition, it gives no information regarding how the data are distributed between the two extremes.

Interquartile range
An alternative to the range is the interquartile range. Quartiles are calculated in a similar way to the median; the median splits a dataset into two equally sized groups, tertiles split the data into three (approximately) equally sized groups, quartiles into four, quintiles into five, and so on. The interquartile range is the range between the bottom and top quartiles, and indicates where the middle 50% of the data lie. Like the median, the interquartile range is not influenced by unusually high or low values and may be particularly useful when data are not symmetrically distributed. Ranges based on alternative subdivisions of the data can also be calculated; for example, if the data are split into deciles, 80% of the data will lie between the bottom and top deciles, and so on.

Standard deviation
The standard deviation is a measure of the degree to which individual observations in a dataset deviate from the mean value. Broadly, it is the average deviation from the mean across all observations. It is calculated by squaring the difference of each individual observation from the mean (squared to remove any negative differences), adding them together, dividing by the total number of observations minus 1, and taking the square root of the result. Algebraically, the standard deviation for a set of n values (X1, X2, ..., Xn) is written as follows:

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}$$

where $\sum$ denotes the sum over all n observations and $\bar{X}$ is the mean described above (Eqn 1). It can be seen from this expression that if individual observations are all close to the mean then the standard deviation will be small (at the extreme, if all observations were equal to the mean then the standard deviation would be zero). Conversely, if the observations vary widely then the standard deviation will be substantially larger. The standard deviation summarizes a great deal of information in one number and, like the mean, has useful mathematical properties.

Variance

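Another measure of variability that may be encountered is the variance. This is simply the square of the standard deviation:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$

where $\bar{X}$ is the mean and n is the number of observations.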
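The measures of variability described in this section can be computed together in a short sketch. The seven data values are invented for the example. Note one assumption in particular: quartiles can be calculated under several slightly different conventions, and the figures here use the default ("exclusive") method of Python's `statistics.quantiles`, which may differ from the values other software reports for the same data.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical observations, mean = 42 / 7 = 6

# Range: largest minus smallest observation.
data_range = max(data) - min(data)

# Interquartile range: top quartile minus bottom quartile.
q1, q2, q3 = statistics.quantiles(data, n=4)  # default method="exclusive"
iqr = q3 - q1

# Sample standard deviation and variance (both divide by n - 1).
s = statistics.stdev(data)
variance = statistics.variance(data)

print(f"range = {data_range}, IQR = {iqr}")
print(f"standard deviation = {s:.3f}, variance = {variance}")
```

For these values the squared deviations from the mean sum to 60, so the variance is 60 / 6 = 10 and the standard deviation is its square root.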

The Sampling Distribution and Standard Deviation of the Mean

Population data follow all types of distributions: some are normal, others are skewed like the F-distribution, and some do not even have defined moments (mean, variance, etc.), like the Cauchy distribution. However, many statistical methods, such as the z-test (discussed later in this article), are based on the normal distribution. How does this work when most sample data are not normally distributed? The question highlights a common misunderstanding among those new to statistical inference: the distribution of the population and the sampling distribution of a statistic are not the same.

What is a sampling distribution? Imagine an engineer is estimating the mean weight of widgets produced in a large batch. The engineer measures the weight of N widgets and calculates the mean. So far, one sample has been taken. The engineer then takes another sample, and another, and continues until a very large number of samples, and thus a large number of sample mean weights, have been gathered (for simplicity, assume the batch of widgets being sampled from is nearly infinite). The engineer has generated a sampling distribution. As the name suggests, a sampling distribution is simply the distribution of a particular statistic (calculated for samples of a set size) for a particular population. In this example, the statistic is the mean widget weight and the sample size is N.

If the engineer were to plot a histogram of the mean widget weights, he or she would see an approximately bell-shaped distribution. This is because the Central Limit Theorem states that, as the sample size grows, the sampling distribution of the sample mean approaches the normal distribution, provided the population has a finite variance. Conveniently, there is a relationship between the sample standard deviation ($s$) and the standard deviation of the sampling distribution ($s_{\bar{X}}$, also known as the standard deviation of the mean or the standard error). This relationship is shown in equation (5) below:

$$s_{\bar{X}} = \frac{s}{\sqrt{N}} \qquad (5)$$

An important feature of the standard deviation of the mean is the $\sqrt{N}$ factor in the denominator. As the sample size increases, the standard deviation of the mean decreases, while the standard deviation itself does not change appreciably.
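The engineer's thought experiment can be imitated with a small simulation. This is a sketch under stated assumptions: widget weights are replaced by draws from a skewed exponential population with standard deviation 1.0 (via `random.expovariate(1.0)`), 2,000 samples are taken for each sample size, and the helper name `sample_means` is hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def sample_means(sample_size, n_samples=2000):
    """Draw repeated samples from a skewed population and return their means."""
    means = []
    for _ in range(n_samples):
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


results = {}
for n in (25, 100):
    means = sample_means(n)
    observed = statistics.stdev(means)  # spread of the sampling distribution
    predicted = 1.0 / n ** 0.5          # population standard deviation (1.0 here) / sqrt(N)
    results[n] = (observed, predicted)
    print(f"N = {n}: observed {observed:.3f}, predicted {predicted:.3f}")
```

Quadrupling the sample size from 25 to 100 should roughly halve the spread of the sample means, even though the population itself is far from normal.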
