You are on page 1of 71

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Matplotlib
A tutorial
Devert Alexandre
School of Software Engineering of USTC

30 November 2012 Slide 1/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Table of Contents
1

First steps

Curve plots

Scatter plots

Boxplots

Histograms

Usage example

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 2/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Curve plot
Lets plot a curve
i m p o r t math
import m a t p l o t l i b . pyplot as p l t

# Generate a s i n u s o i d
n b S a m p l e s = 256
xRange = (math . p i , math . p i )
x , y = [] , []
f o r n i n xrange ( nbSamples ) :
k = ( n + 0 . 5 ) / nbSamples
x . append ( xRange [ 0 ] + ( xRange [ 1 ] xRange [ 0 ] ) k )
y . append ( math . s i n ( x [ 1 ] ) )
# Plot the s i n u s o i d
plt . plot (x , y)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 3/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Curve plot
This will show you something like this
1.0

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 4/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

numpy
matplotlib can work with numpy arrays
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

# Generate a s i n u s o i d
n b S a m p l e s = 256
xRange = (math . p i , math . p i )
x , y = numpy . z e r o s ( n b S a m p l e s ) , numpy . z e r o s ( n b S a m p l e s )
f o r n i n xrange ( nbSamples ) :
k = ( n + 0 . 5 ) / nbSamples
x [ n ] = xRange [ 0 ] + ( xRange [ 1 ] xRange [ 0 ] ) k
y [ n ] = math . s i n ( x [ n ] )
# Plot the s i n u s o i d
plt . plot (x , y)
p l t . show ( )

numpy provides a lot of function and is efficient


Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 5/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

numpy
zeros build arrays filled of 0
linspace build arrays filled with an arithmetic

sequence
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

# Generate a s i n u s o i d
x = numpy . l i n s p a c e (math . p i , math . p i , num=256)
y = numpy . z e r o s ( n b S a m p l e s )
f o r n i n xrange ( nbSamples ) :
y [ n ] = math . s i n ( x [ n ] )
# Plot the s i n u s o i d
plt . plot (x , y)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 6/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

numpy
numpy functions can work on entire arrays
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

# Generate a s i n u s o i d
x = numpy . l i n s p a c e (math . p i , math . p i , num=256)
y = numpy . s i n ( x )
# Plot the s i n u s o i d
plt . plot (x , y)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 7/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

PDF output
Exporting to a PDF file is just one change
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

# Generate a s i n u s o i d
x = numpy . l i n s p a c e (math . p i , math . p i , num=256)
y = numpy . s i n ( x )
# Plot the s i n u s o i d
plt . plot (x , y)
p l t . s a v e f i g ( s i n p l o t . p d f , t r a n s p a r e n t=True )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 8/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Table of Contents
1

First steps

Curve plots

Scatter plots

Boxplots

Histograms

Usage example

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 9/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Multiple curves
Its often convenient to show several curves in one figure
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

#
x
y
z

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Plot the curves


plt . plot (x , y)
plt . plot (x , z)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 10/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Multiple curves
Its often convenient to show several curves in one figure
1.0

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 10/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Custom colors
Changing colors can help to make nice documents
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

#
x
y
z

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Plot the curves


p l t . p l o t ( x , y , c= #FF4500 )
p l t . p l o t ( x , z , c= #4682B4 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 11/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Custom colors
Changing colors can help to make nice documents
1.0

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 11/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Line thickness
Line thickness can be changed as well
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

#
x
y
z

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Plot the curves


p l t . p l o t ( x , y , l i n e w i d t h =3 , c= #FF4500 )
p l t . p l o t ( x , z , c= #4682B4 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 12/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Line thickness
Line thickness can be changed as well
1.0

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 12/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Line patterns
For printed document, colors can be replaced by line
patterns
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# L i n e s t y l e s can be , , . ,
#
x
y
z

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Plot the curves


p l t . p l o t ( x , y , l i n e s t y l e = , c= #000000 )
p l t . p l o t ( x , z , c= #808080 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 13/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Line patterns
For printed document, colors can be replaced by line
patterns
1.0

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 13/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Markers
It sometime relevant to show the data points
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# M a r k e r s can be
#
x
y
z

. ,

,,

o ,

1 and more

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=64)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Plot the curves


p l t . p l o t ( x , y , m a r k e r= 1 , m a r k e r s i z e =15 , c= #000000 )
p l t . p l o t ( x , z , c= #000000 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 14/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Markers
It sometime relevant to show the data points
1.0

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 14/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Legend
A legend can help to make selfexplanatory figures
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# l e g e n d l o c a t i o n can be b e s t ,
#
x
y
z

center ,

left ,

right , etc .

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Plot the curves


p l t . p l o t ( x , y , c= #FF4500 , l a b e l= s i n ( x ) )
p l t . p l o t ( x , z , c= #4682B4 , l a b e l= c o s ( x ) )
p l t . l e g e n d ( l o c= b e s t )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 15/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Legend
A legend can help to make selfexplanatory figures
1.0

sin(x)
cos(x)

0.5

0.0

0.5

1.0 4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 15/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Custom axis scale


Changing the axis scale can improve readability
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# l e g e n d l o c a t i o n can be b e s t ,
#
x
y
z

center ,

left ,

right , etc .

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
a x i s . s e t y l i m ( 0.5 math . p i , 0 . 5 math . p i )
# Plot the curves
p l t . p l o t ( x , y , c= #FF4500 , l a b e l= s i n ( x ) )
p l t . p l o t ( x , z , c= #4682B4 , l a b e l= c o s ( x ) )
p l t . l e g e n d ( l o c= b e s t )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 16/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Custom axis scale


Changing the axis scale can improve readability
1.5

sin(x)
cos(x)

1.0
0.5
0.0
0.5
1.0
1.5

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 16/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Grid
Same goes for a grid, can be helpful
i m p o r t math
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# l e g e n d l o c a t i o n can be b e s t ,
#
x
y
z

center ,

left ,

right , etc .

G e n e r a t e a s i n u s o i d and a c o s i n o i d
= numpy . l i n s p a c e (math . p i , math . p i , num=256)
= numpy . s i n ( x )
= numpy . c o s ( x )

# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
a x i s . s e t y l i m ( 0.5 math . p i , 0 . 5 math . p i )
a x i s . g r i d ( True )
# Plot the curves
p l t . p l o t ( x , y , c= #FF4500 , l a b e l= s i n ( x ) )
p l t . p l o t ( x , z , c= #4682B4 , l a b e l= c o s ( x ) )
p l t . l e g e n d ( l o c= b e s t )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 17/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Grid
Same goes for a grid, can be helpful
1.5

sin(x)
cos(x)

1.0
0.5
0.0
0.5
1.0
1.5

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 17/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Error bars
Your data might come with a known measure error
import
import
import
import

math
numpy
numpy . random
m a t p l o t l i b . pyplot as p l t

# Generate a noisy s i n u s o i d
x = numpy . l i n s p a c e (math . p i , math . p i , num=48)
y = numpy . s i n ( x + 0 . 0 5 numpy . random . s t a n d a r d n o r m a l ( l e n ( x ) ) )
y e r r o r = 0 . 1 numpy . random . s t a n d a r d n o r m a l ( l e n ( x ) )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
a x i s . s e t y l i m ( 0.5 math . p i , 0 . 5 math . p i )
# Plot the curves
p l t . p l o t ( x , y , c= #FF4500 )
p l t . e r r o r b a r ( x , y , y e r r=y e r r o r )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 18/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Error bars
Your data might come with a known measure error
1.5
1.0
0.5
0.0
0.5
1.0
1.5

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 18/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Table of Contents
1

First steps

Curve plots

Scatter plots

Boxplots

Histograms

Usage example

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 19/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Scatter plot
A scatter plot just shows one point for each dataset entry

i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t
# G e n e r a t e a 2d n o r m a l d i s t r i b u t i o n
n b P o i n t s = 100
x = numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s )
y = numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s )
# Plot the points
plt . scatter (x , y)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 20/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Scatter plot
A scatter plot just shows one point for each dataset entry

5
4
3
2
1
0
1
2
3
43

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 20/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Aspect ratio
If can be very important to have the same scale on both
axis
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t
# G e n e r a t e a 2d n o r m a l d i s t r i b u t i o n
n b P o i n t s = 100
x = numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s )
y = 0 . 1 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . a d d s u b p l o t ( 1 1 1 , a s p e c t= e q u a l )
# Plot the points
p l t . s c a t t e r ( x , y , c= #FF4500 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 21/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Aspect ratio
If can be very important to have the same scale on both
axis

0.3
0.2
0.1
0.0
0.1
0.2
0.3
3

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 21/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Aspect ratio
Alternative way to keep the same scale on both axis
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t
# G e n e r a t e a 2d n o r m a l d i s t r i b u t i o n
n b P o i n t s = 100
x = numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s )
y = 0 . 1 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
cmin , cmax = min ( min ( x ) , min ( y ) ) , max ( max ( x ) , max ( y ) )
cmin = 0 . 0 5 ( cmax cmin )
cmax += 0 . 0 5 ( cmax cmin )
a x i s . s e t x l i m ( cmin , cmax )
a x i s . s e t y l i m ( cmin , cmax )
# Plot the points
p l t . s c a t t e r ( x , y , c= #FF4500 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 22/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Aspect ratio
Alternative way to keep the same scale on both axis

2
1
0
1
2
2

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 22/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Multiple scatter plots


As for curve, you can show 2 datasets on one figure
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t
c o l o r s = ( #FF4500 , #3CB371 , #4682B4 , #DB7093 , #FFD700 )
# G e n e r a t e a 2d n o r m a l d i s t r i b u t i o n
n b P o i n t s = 100
x , y = [] ,

[]

x += [ numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) ]
y += [ 0 . 2 5 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) ]
x += [ 0 . 5 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) + 3 . 0 ]
y += [ 2 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) + 2 . 0 ]
# Axis setup
f i g = plt . figure ()
a x i s = f i g . a d d s u b p l o t ( 1 1 1 , a s p e c t= e q u a l )
# Plot the points
f o r i in xrange ( len ( x ) ) :
p l t . s c a t t e r ( x [ i ] , y [ i ] , c=c o l o r s [ i % l e n ( c o l o r s ) ] )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 23/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Multiple scatter plots


As for curve, you can show 2 datasets on one figure
8
6
4
2
0
2
43

1 0

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 23/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Showing centers
It can help to see the centers or the median points
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t
c o l o r s = ( #FF4500 , #3CB371 , #4682B4 , #DB7093 , #FFD700 )
# G e n e r a t e a 2d n o r m a l d i s t r i b u t i o n
n b P o i n t s = 100
x , y = [] ,

[]

x += [ numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) ]
y += [ 0 . 2 5 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) ]
x += [ 0 . 5 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) + 3 . 0 ]
y += [ 2 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) + 2 . 0 ]
# Axis setup
f i g = plt . figure ()
a x i s = f i g . a d d s u b p l o t ( 1 1 1 , a s p e c t= e q u a l )
# Plot the points
f o r i in xrange ( len ( x ) ) :
col = colors [ i % len ( colors )]
p l t . s c a t t e r ( x [ i ] , y [ i ] , c=c o l )
p l t . s c a t t e r ( [ numpy . median ( x [ i ] ) ] ,

[ numpy . median ( y [ i ] ) ] , c=c o l , s =250)

p l t . show ( )
Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 24/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Showing centers
It can help to see the centers or the median points
8
6
4
2
0
2
43

1 0

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 24/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Marker styles
You can use different markers styles
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t
m a r k e r s = ( + , , . )
# G e n e r a t e a 2d n o r m a l d i s t r i b u t i o n
n b P o i n t s = 100
x , y = [] ,

[]

x += [ numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) ]
y += [ 0 . 2 5 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) ]
x += [ 0 . 5 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) + 3 . 0 ]
y += [ 2 numpy . random . s t a n d a r d n o r m a l ( n b P o i n t s ) + 2 . 0 ]
# Axis setup
f i g = plt . figure ()
a x i s = f i g . a d d s u b p l o t ( 1 1 1 , a s p e c t= e q u a l )

# Plot the points


f o r i in xrange ( len ( x ) ) :
m = markers [ i % l en ( markers ) ]
p l t . s c a t t e r ( x [ i ] , y [ i ] , m a r k e r=m, c= #000000 )
p l t . s c a t t e r ( [ numpy . median ( x [ i ] ) ] , [ numpy . median ( y [ i ] ) ] , m a r k e r=m, s =250 , c= #000000
p l t . show ( )
Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 25/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Marker styles
You can use different markers styles
10
8
6
4
2
0
2
44 3 2 1 0 1 2 3 4 5

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 25/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Table of Contents
1

First steps

Curve plots

Scatter plots

Boxplots

Histograms

Usage example

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 26/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Boxplots
Lets do a simple boxplot
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Generate normal d i s t r i b u t i o n data


x = numpy . random . s t a n d a r d n o r m a l ( 2 5 6 )
# Show a b o x p l o t o f t h e d a t a
plt . boxplot (x)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 27/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Boxplots
Lets do a simple boxplot
3
2
1
0
1
2
3

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 27/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Boxplots
You might want to show the original data in the same
time
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Generate normal d i s t r i b u t i o n data


x = numpy . random . s t a n d a r d n o r m a l ( 2 5 6 )
# Show a b o x p l o t o f t h e d a t a
p l t . s c a t t e r ( [ 0 ] l e n ( x ) , x , c= #4682B4 )
p l t . boxplot (x , p o s i t i o n s =[0])
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 28/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Boxplots
You might want to show the original data in the same
time
4
3
2
1
0
1
2
3
4

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 28/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Multiple boxplots
Boxplots are often used to show side by side various
distributions
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Generate normal d i s t r i b u t i o n data


data = [ ]
f o r i in xrange ( 5 ) :
mu = 10 numpy . random . r a n d o m s a m p l e ( )
s i g m a = 2 numpy . random . r a n d o m s a m p l e ( ) + 0 . 1
d a t a . append ( numpy . random . n o r m a l (mu , sigma , 2 5 6 ) )
# Show a b o x p l o t o f t h e d a t a
p l t . boxplot ( data )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 29/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Multiple boxplots
Boxplots are often used to show side by side various
distributions
14
12
10
8
6
4
2

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 29/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Orientation
Changing the orientation is easily done
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Generate normal d i s t r i b u t i o n data


data = [ ]
f o r i in xrange ( 5 ) :
mu = 10 numpy . random . r a n d o m s a m p l e ( )
s i g m a = 2 numpy . random . r a n d o m s a m p l e ( ) + 0 . 1
d a t a . append ( numpy . random . n o r m a l (mu , sigma , 2 5 6 ) )
# Show a b o x p l o t o f t h e d a t a
p l t . b o x p l o t ( data , v e r t =0)
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 30/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Orientation
Changing the orientation is easily done

5
4
3
2
1
6

10

12

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 30/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Legend
Good graphics have a proper legend
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Generate normal d i s t r i b u t i o n data


l a b e l s = [ m e r c u r y , l e a d , l i t h i u m , t u n g s t e n e , cadnium ]
data = [ ]
f o r i in xrange ( len ( l a b e l s ) ) :
mu = 10 numpy . random . r a n d o m s a m p l e ( ) + 100
s i g m a = 2 numpy . random . r a n d o m s a m p l e ( ) + 0 . 1
d a t a . append ( numpy . random . n o r m a l (mu , sigma , 2 5 6 ) )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
axis . s e t t i t l e ( Alien nodules composition )
xtickNames = p l t . setp ( axis , x t i c k l a b e l s = l a b e l s )
a x i s . s e t y l a b e l ( c o n c e n t r a t i o n ( ppm ) )
# Show a b o x p l o t o f t h e d a t a
p l t . boxplot ( data )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 31/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Legend
Good graphics have a proper legend
Alien nodules composition

112

concentration (ppm)

110
108
106
104
102
100

mercury

lead

lithium

tungstene

cadnium

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 31/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Table of Contents
1

First steps

Curve plots

Scatter plots

Boxplots

Histograms

Usage example

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 32/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Histograms
Histogram are convenient to sum-up results
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Some d a t a
d a t a = numpy . a b s ( numpy . random . s t a n d a r d n o r m a l ( 3 0 ) )
# Show an h i s t o g r a m
p l t . b a r h ( r a n g e ( l e n ( d a t a ) ) , data , c o l o r= #4682B4 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 33/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Histograms
Histogram are convenient to sum-up results
30
25
20
15
10
5
00.0

0.5

1.0

1.5

2.0

2.5

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 33/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Histograms
A variant to show 2 quantities per item on 1 figure
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Some d a t a
data = [ [ ] , [ ] ]
f o r i i n xrange ( l e n ( data ) ) :
d a t a [ i ] = numpy . a b s ( numpy . random . s t a n d a r d n o r m a l ( 3 0 ) )
# Show an h i s t o g r a m
l a b e l s = range ( l e n ( data [ 0 ] ) )
p l t . b a r h ( l a b e l s , d a t a [ 0 ] , c o l o r= #4682B4 )
p l t . b a r h ( l a b e l s , 1 d a t a [ 1 ] , c o l o r= #FF4500 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 34/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Histograms
A variant to show 2 quantities per item on 1 figure
30
25
20
15
10
5
03

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 34/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Labels
Very often, we need to name items on an histogram
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Some d a t a
names = [ Wang Bu , Cheng Cao , Zhang Xue L i , Tang Wei , Sun Wu Kong ]
marks = 7 numpy . random . r a n d o m s a m p l e ( l e n ( names ) ) + 3
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
a x i s . s e t x l i m (0 , 10)
p l t . y t i c k s ( numpy . a r a n g e ( l e n ( marks ) ) + 0 . 5 , names )
a x i s . s e t t i t l e ( Datam i n i n g marks )
a x i s . s e t x l a b e l ( mark )
# Show an h i s t o g r a m
p l t . b a r h ( r a n g e ( l e n ( marks ) ) , marks , c o l o r= #4682B4 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 35/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Labels
Very often, we need to name items on an histogram
Data-mining marks
Sun Wu Kong

Tang Wei

Zhang Xue Li

Cheng Cao

Wang Bu
0

mark

10

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 35/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Error bars
Error bars, to indicate the accuracy of values
i m p o r t numpy
i m p o r t numpy . random
import m a t p l o t l i b . pyplot as p l t

# Some d a t a
names = [ 6809 , 6502 , 8086 , Z80 , RCA1802 ]
s p e e d = 70 numpy . random . r a n d o m s a m p l e ( l e n ( names ) ) + 30
e r r o r = 9 numpy . random . r a n d o m s a m p l e ( l e n ( names ) ) + 1
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
p l t . y t i c k s ( numpy . a r a n g e ( l e n ( names ) ) + 0 . 5 , names )
a x i s . s e t t i t l e ( 8 b i t s CPU benchmark s t r i n g
axis . set xlabel ( score )

test )

# Show an h i s t o g r a m
p l t . b a r h ( r a n g e ( l e n ( names ) ) , s p e e d , x e r r=e r r o r , c o l o r= #4682B4 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 36/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Error bars
Error bars, to indicate the accuracy of values
8 bits CPU benchmark - string test
RCA1802

Z80

8086

6502

6809
0

20

40

score

60

80

100

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 36/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Table of Contents
1

First steps

Curve plots

Scatter plots

Boxplots

Histograms

Usage example

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 37/44

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

SCHOOL OF SOFTWARE ENGINEERING OF USTC

Old Faithful

Lets display some real data: Old Faithful geyser

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 38/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Old Faithful
This way works, but good example of half-done job
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# Read t h e d a t a
d a t a = numpy . l o a d t x t ( . / d a t a s e t s / g e y s e r . d a t )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . a d d s u b p l o t ( 1 1 1 , a s p e c t= e q u a l )
# Plot the points
p l t . s c a t t e r ( d a t a [ : , 0 ] , d a t a [ : , 1 ] , c= #FF4500 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 39/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Old Faithful
This way works, but good example of half-done job

5.5
5.0
4.5
4.0
3.5
3.0
2.5
2.0
1.530

40

50

60

70

80

90

100

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 39/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Old Faithful
Lets make a more readable figure
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t
# Read t h e d a t a
d a t a = numpy . l o a d t x t ( . / d a t a s e t s / g e y s e r . d a t )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
a x i s . s e t t i t l e ( Old F a i t h f u l g e y s e r d a t a s e t )
a x i s . s e t x l a b e l ( w a i t i n g t i m e (mn . ) )
a x i s . s e t y l a b e l ( e r u p t i o n d u r a t i o n (mn . ) )
# Plot the points
p l t . s c a t t e r ( d a t a [ : , 0 ] , d a t a [ : , 1 ] , c= #FF4500 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 40/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Old Faithful
Lets make a more readable figure
Old Faithful geyser dataset

5.5
5.0

eruption duration (mn.)

4.5
4.0
3.5
3.0
2.5
2.0
1.530

40

50

60
70
waiting time (mn.)

80

90

100

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 40/44

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

SCHOOL OF SOFTWARE ENGINEERING OF USTC

Mercury & fishes

Lets display more complex data: fishes and mercury


poisoning

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 41/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Mercury & fishes


A first try
i m p o r t numpy
import m a t p l o t l i b . pyplot as p l t

# Read t h e d a t a
d a t a = numpy . l o a d t x t ( . / d a t a s e t s / f i s h . d a t )
# Axis setup
f i g = plt . figure ()
a x i s = f i g . add subplot (111)
a x i s . s e t t i t l e ( North C a r o l i n a f i s h e s mercury c o n c e n t r a t i o n )
a x i s . s e t x l a b e l ( weight (g . ) )
a x i s . s e t y l a b e l ( m e r c u r y c o n c e n t r a t i o n ( ppm ) )
# Plot the points
p l t . s c a t t e r ( d a t a [ : , 3 ] , d a t a [ : , 4 ] , c= #FF4500 )
p l t . show ( )

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 42/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Mercury & fishes


A first try
4.0

North Carolina fishes mercury concentration

mercury concentration (ppm)

3.5
3.0
2.5
2.0
1.5
1.0
0.5
0.0
0.51000

1000

2000
weight (g.)

3000

4000

5000

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 42/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Mercury & fishes


Show more information with subplots

mercury concentration
mercury
(ppm)concentration (ppm)

weight (g.)

North Carolina fishes mercury concentration


5000
4000
3000
2000
1000
0
100020
4.0
3.5
3.0
2.5
2.0
1.5
1.0
0.5
0.0
0.51000
4.0
3.5
3.0
2.5
2.0
1.5
1.0
0.5
0.0
0.520

30

40

1000

30

length (cm.)

50

2000
weight (g.)

40

length (cm.)

60

3000

50

4000

60

70

5000

70

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 43/44

SCHOOL OF SOFTWARE ENGINEERING OF USTC

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Mercury & fishes


Data from one river with its own color

mercury concentration
mercury
(ppm)concentration (ppm)

weight (g.)

North Carolina fishes mercury concentration


5000
4000
3000
2000
1000
0
100020
4.0
3.5
3.0
2.5
2.0
1.5
1.0
0.5
0.0
0.51000
4.0
3.5
3.0
2.5
2.0
1.5
1.0
0.5
0.0
0.520

30

40

1000

30

length (cm.)

50

2000
weight (g.)

40

length (cm.)

60

3000

50

4000

60

70

5000

70

Devert Alexandre (School of Software Engineering of USTC) Matplotlib Slide 44/44

You might also like