
2

Introduction to the R Language


2.1 What is R?
In short, R is software for statistical analysis and graphics. In essence, though, R is a general-purpose computer language that can be used for many different purposes, from simple calculations, recreational mathematics, and matrix computation to complex statistical analyses. Because it is a language, R can also be used to develop specialized software for a particular computational problem.
R was created by two statisticians, Ross Ihaka and Robert Gentleman. Since R first appeared, many statisticians and mathematicians around the world have supported it and taken part in its development. The creators' policy is one of openness (Open Access), and partly because of this policy R is completely free. Anyone, anywhere in the world, can access and download the entire source code of R to their own computer. So far, after less than 5 years of development, more and more statisticians, mathematicians, and researchers in every field have switched to R for analyzing scientific data. Worldwide there is a network of nearly one million R users, and this number is growing exponentially. It may be that within 10 years we will no longer need expensive statistical software such as SAS, SPSS, or Stata (these packages are very expensive, up to 100,000 USD per year) for statistical analysis, because all of those analyses can be carried out in R.
For this reason, anyone doing scientific research, especially in countries that are still poor such as ours, should learn to use R for statistical analysis and graphics. This short text shows the reader how to use R. I assume that the reader knows nothing about R, but I do expect the reader to have some familiarity with using a computer.

2.2 Downloading and installing R


To use R, the first step is to install it on our own computer. To do this, we go online to the website of the Comprehensive R Archive Network (CRAN):
http://cran.R-project.org.
The file to download depends on the version, but its name usually begins with the letter R followed by the version number. For example, the version I was using at the end of 2005 was 2.2.1, so the file to download was:

R-2.2.1-win32.exe
This file is about 26 MB, and the specific download address is:
http://cran.r-project.org/bin/windows/base/R-2.2.1-win32.exe
On this website we can find a great deal of documentation on how to use R, at every level from elementary to advanced. For readers who are not yet comfortable with English, this text provides the information needed to use R without having to read the other documents.
Once R has been downloaded, the next step is to install (set up) it on the computer. To do this, we simply click on the downloaded file and follow the installation instructions on the screen. This is a very simple step, and the installation can be completed in about 1 minute.

2.3 Packages for specialized analyses


R gives us a computer "language" and a number of functions for basic, simple analyses. For more complex analyses, we need to download additional packages. A package is a small piece of software, developed by statisticians to solve a specific problem, that runs inside the R system. For example, for linear regression R has the function lm, but for deeper and more complex analyses we need packages such as lme4. These packages must be downloaded and installed.
The address for downloading packages is again http://cran.r-project.org; click on "Packages" in the menu on the left of the web page. Some of the packages that need to be downloaded for the examples in this book are:
Package name    Purpose
Trellis         Drawing graphs and making them look better
lattice         Drawing graphs and making them look better
Hmisc           Some data modeling methods by F. Harrell
Design          Some study design models by F. Harrell
Epi             Epidemiological analyses
epitools        Another package devoted to epidemiological analyses
foreign         Importing data from other software such as SPSS, Stata, SAS, etc.
Rmeta           Meta-analysis
meta            Another package for meta-analysis
survival        Analyses based on the Cox model (Cox's proportional hazards model)
splines         Package required for survival to run
Zelig           Statistical analyses in the social sciences
genetics        Analysis of genetic data
BMA             Bayesian model averaging
leaps           Package used by BMA
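
The download-and-install step described above can also be done from inside R. The following is a minimal sketch, assuming a standard R installation; the variable names wanted and to_install are illustrative, and the install.packages() line is commented out because it needs an internet connection.

```r
# Packages from the table that we would like to use (illustrative selection).
wanted <- c("lattice", "survival", "foreign")

# Which of them are not yet installed on this computer?
to_install <- setdiff(wanted, rownames(installed.packages()))
print(to_install)

# Download and install the missing ones from CRAN (requires internet access).
# install.packages(to_install)

# Load an installed package into the current session before using it.
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)
}
```

A package only has to be installed once, but library() must be called in every new R session that uses it.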

2.4 Starting and quitting R


After installation is complete, an R 2.2.1 icon appears on the computer's desktop. At this point we are ready to use R. Click this icon and a window like the following opens:

[Figure: the R console window]

R is usually used in "command line" mode, which means that we type commands directly at the red prompt shown above. Commands must strictly follow the "grammar" and language of R. In fact, the whole purpose of this text is to help the reader understand and write in the R language. One rule of this grammar is that R distinguishes between "Library" and "library". In other words, R is case sensitive. Another convention is that when a name consists of two separate words, R usually puts a period in place of the space, as in data.frame, t.test, read.table, and so on. These points are very important; overlooking them will cost the user a lot of time.
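
As a small sketch of the two conventions just described, the check below uses exists() and is.function() from base R; the variable names has_lower and has_upper are illustrative, and the second result assumes a standard session in which nothing named Library has been defined.

```r
# R is case sensitive: 'library' is a function, 'Library' is not defined.
has_lower <- exists("library")
has_upper <- exists("Library")
has_lower        # TRUE
has_upper        # FALSE in a standard R session

# Multi-word names commonly use a period in place of the space.
is.function(data.frame)
is.function(t.test)
is.function(read.table)
```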
If a command is typed with correct "grammar", R either gives us another prompt or prints some result (depending on the command); if the command is not grammatical, R prints a short message saying that it is wrong or not understood. For example, if we type:
> x <- rnorm(20)
>
R understands the command, carries it out, and then gives us another prompt: >. But if we type:

> R is great
R will not accept this command, because it is not valid R language, and the following message appears:
Error: syntax error
>
To leave R, we can simply click the close button (x) in the upper-right corner of the window, or type the command q().

2.5 The "grammar" of the R language


The general "grammar" of R is built around the command, or function. A function takes arguments, so the function name is followed by the arguments that we must supply. For example:
> reg <- lm(y ~ x)
Here reg is an object, lm is a function, and y ~ x is the function's argument. Or:
> setwd("c:/works/stats")
Here setwd is a function, and "c:/works/stats" is its argument.
To find out which arguments a function takes, use the command args(x) (args is short for arguments), where x is the function of interest:
> args(lm)
function (formula, data, subset, weights, na.action, method = "qr",
model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
contrasts = NULL, offset, ...)

NULL
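
To make the reg <- lm(y ~ x) call above concrete, here is a minimal sketch on simulated data. The intercept 2 and slope 3 are assumptions chosen only for illustration, and reg2 is an illustrative name showing the same argument supplied by name.

```r
set.seed(1)                      # make the simulation reproducible
x <- rnorm(50)                   # 50 simulated predictor values
y <- 2 + 3 * x + rnorm(50)       # response with assumed intercept 2, slope 3

reg  <- lm(y ~ x)                # formula supplied by position
reg2 <- lm(formula = y ~ x)      # same call, argument supplied by name

coef(reg)                        # estimated intercept and slope
names(formals(lm))               # argument names, as listed by args(lm)
```

Supplying arguments by name is optional for the first argument, but it makes longer calls easier to read.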

R is an object-oriented language. This means that data in R are stored in objects. This orientation also affects the way R code is written. For example, to test whether x equals 5 we do not write x = 5, as we would in ordinary notation; R requires x == 5.
In R, x = 5 is an assignment, equivalent to x <- 5. The latter form (using the symbol <-) is preferred over the former (=). For example:
> x <- rnorm(10)
simulates 10 values from a standard normal distribution and stores them in the object x. We could also write x = rnorm(10).
Mt s k hiu hay dng trong R l:
x == 5
x != 5
y < x
x > y
z <= 7
p >= 1
is.na(x)
A & B
A | B
!

x bng 5
x khng bng 5
y nh hn x
x ln hn y
z nh hn hoc bng 7
p ln hn hoc bng 1
C phi x l bin s missing
A v B (AND)
A hoc B (OR)
Khng l (NOT)
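
The symbols listed above can be tried on a small vector. This is a brief sketch; the vector x and its values are made up for illustration, and it includes one missing value (NA) to show how is.na() behaves.

```r
x <- c(3, 5, NA, 8)      # a small vector with one missing value

x == 5                   # FALSE  TRUE    NA FALSE
x != 5                   #  TRUE FALSE    NA  TRUE
is.na(x)                 # FALSE FALSE  TRUE FALSE

# Combine conditions with & (AND), | (OR) and ! (NOT).
!is.na(x) & x > 4        # FALSE  TRUE FALSE  TRUE
is.na(x) | x <= 3        #  TRUE FALSE  TRUE FALSE
```

Note that comparing a missing value with == gives NA rather than TRUE or FALSE, which is why is.na() is needed.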

In R, any text or command that follows the # symbol has no effect, because # is reserved for comments added by the user. For example:
> # the following command simulates 10 normal values
> x <- rnorm(10)

2.6 Naming conventions in R


Naming an object or a variable in R is quite flexible, because R does not impose as many restrictions as other software. An object name must be written as one unbroken word (that is, it cannot contain a space). For example, R accepts myobject but not my object.
> myobject <- rnorm(10)
> my object <- rnorm(10)
Error: syntax error in "my object"

A name like myobject can be hard to read, however, so it is better to separate the words with a period ".", as in my.object.
> my.object <- rnorm(10)
An important point to remember is that R distinguishes upper-case from lower-case letters, so My.object is different from my.object. For example:
> My.object.u <- 15
> my.object.L <- 5
> My.object.u + my.object.L
[1] 20

A few points to keep in mind when choosing names in R:

- Avoid naming a variable with an underscore "_" or a hyphen, as in my_object or my-object (R reads my-object as my minus object).

- Avoid giving an object the same name as a variable in a dataset. For example, if we have a data.frame (a dataset) containing a variable age, we should not also create a separate object called age, that is, we should not write: age <- age. However, if the data.frame is named data, we can refer to the variable age with the $ character: data$age (that is, the variable age in the data.frame data), and in that case age <- data$age is acceptable.
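
The data$age notation described in the list above can be sketched as follows. The data.frame and its values are made up for illustration; it is named data only to match the text.

```r
# A small illustrative data.frame; the values are made up for the example.
data <- data.frame(id = 1:4, age = c(23, 35, 41, 58))

data$age                 # the variable age inside the data.frame data
age <- data$age          # a copy stored as a separate object
mean(age)                # 39.25

# Names are case sensitive, so Age is not the same object as age.
exists("age")            # TRUE
```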

2.7 Getting help in R
Besides args(), R provides the help() command so that users can learn the "grammar" of each function. For example, to find out which arguments the function lm takes, simply type:
> help(lm)
or
> ?lm
A window appears on the right of the screen explaining how the function is used, often with examples. You can simply copy and paste an example into R to see how it works.
Before using R, and in addition to this book, you may wish to read the documentation built into R by choosing the Help menu and then Html help, as in the figure below. You can also copy and paste commands from that documentation into R to see how R works.

[Figure: the Help menu with Html help selected]

Instead of using the menu, you can also simply type:

> help.start()
and a window appears giving access to the documentation for the whole R system.
The apropos function is also very useful, because it lists all the functions in R whose names contain the characters we are looking for. For example, to find out which functions in R have "lm" in their names, simply type:
> apropos("lm")

and R reports the available functions whose names contain lm:
 [1] ".__C__anova.glm"   ".__C__anova.glm.null" ".__C__glm"
 [4] ".__C__glm.null"    ".__C__lm"             ".__C__mlm"
 [7] "anova.glm"         "anova.glmlist"        "anova.lm"
[10] "anova.lmlist"      "anova.mlm"            "anovalist.lm"
[13] "contr.helmert"     "glm"                  "glm.control"
[16] "glm.fit"           "glm.fit.null"         "hatvalues.lm"
[19] "KalmanForecast"    "KalmanLike"           "KalmanRun"
[22] "KalmanSmooth"      "lm"                   "lm.fit"
[25] "lm.fit.null"       "lm.influence"         "lm.wfit"
[28] "lm.wfit.null"      "model.frame.glm"      "model.frame.lm"
[31] "model.matrix.lm"   "nlm"                  "nlminb"
[34] "plot.lm"           "plot.mlm"             "predict.glm"
[37] "predict.lm"        "predict.mlm"          "print.glm"
[40] "print.lm"          "residuals.glm"        "residuals.lm"
[43] "rstandard.glm"     "rstandard.lm"         "rstudent.glm"
[46] "rstudent.lm"       "summary.glm"          "summary.lm"
[49] "summary.mlm"       "kappa.lm"

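Because apropos() treats its argument as a regular expression, the search can be narrowed to names that begin or end with given characters. The sketch below assumes a standard session with the stats package attached; the variable names are illustrative, and the exact list returned depends on the R version and loaded packages.

```r
# Names that contain "lm" anywhere (what apropos("lm") returns).
contains_lm <- apropos("lm", ignore.case = FALSE)

# Names that begin with "lm": anchor the pattern with ^.
starts_lm <- apropos("^lm", ignore.case = FALSE)

# Names that end with ".lm": escape the period and anchor with $.
ends_lm <- apropos("\\.lm$", ignore.case = FALSE)

length(contains_lm)
starts_lm
ends_lm
```
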
2.8 The working environment


Data must be stored in a directory on the computer. Before using R, it is probably best to create a directory to hold the data, for example c:\works\stats. To tell R where the data are, we use the setwd (set working directory) command:
> setwd("c:/works/stats")
This command tells R that the data are stored in the directory c:\works\stats. Note that R uses the forward slash "/" rather than the backslash "\" used by Windows.
To find out which directory R is currently working in, just type:
> getwd()
[1] "C:/Program Files/R/R-2.2.1"
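
The setwd() and getwd() commands above can be combined so that the original directory is restored afterwards. This is a sketch only: it uses R's temporary directory so that it runs on any computer, and work_dir, old_dir, and example.csv are hypothetical names.

```r
# Use R's temporary directory so the sketch runs on any computer;
# in practice this would be a project folder such as "c:/works/stats".
work_dir <- tempdir()

old_dir <- setwd(work_dir)   # setwd() returns the previous directory
getwd()                      # now the temporary directory

# file.path() joins path components with forward slashes.
data_file <- file.path(getwd(), "example.csv")   # hypothetical file name
data_file

setwd(old_dir)               # go back to where we started
```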
R's default prompt is ">". If we would like a different prompt to suit our own taste, it is easy to change:
> options(prompt="R> ")
R>
Or:
> options(prompt="Tuan> ")
Tuan>
R's default output width is 80 characters, but if we want wider output, we only need to type:
> options(width=100)
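Or, to have R display numbers with 3 significant digits (the scipen option, by contrast, only controls when R switches to scientific notation):
> options(digits=3)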

All of these settings are changed with the options() command. To see what R's current settings are, just type:
> options()
To find the current date:
> Sys.Date()
[1] "2006-03-31"
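
Settings changed with options() stay in effect for the rest of the session. The following sketch, using illustrative names old and shown, saves the previous values and restores them afterwards; it relies on the fact that options() returns the old values of whatever it changes.

```r
# options() returns the previous values of the settings it changes.
old <- options(digits = 3, width = 100)

shown <- format(pi)      # formatted with 3 significant digits
shown                    # "3.14"
getOption("width")       # 100

# Restore the earlier settings.
options(old)
getOption("digits")      # back to the earlier value (7 by default)
```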
For readers who need more information, several documents available online (written in English) are also very useful. They can be downloaded free of charge:
R for beginners (by Emmanuel Paradis):
http://cran.r-project.org/doc/contrib/rdebuts_en.pdf
Using R for data analysis and graphics (by John Maindonald):
http://cran.r-project.org/doc/contrib/usingR.pdf
