Professional Documents
Culture Documents
2&3:
Introduction to SQL
Lecture 2
Announcements!
2
Lecture 2
Today’s Lecture
3. Multi-table queries
• ACTIVITY: Multi-table queries!
4
Lecture 2 > Section 1
5
Lecture 2 > Section 1
6
Lecture 2 > Section 1 > SQL
SQL Motivation
• “Not-Yet-SQL?”
Lecture 2 > Section 1 > SQL
Basic SQL
8
Lecture 2 > Section 1 > SQL
SQL Introduction
• SQL is a standard language for querying and manipulating data
SQL stands for
• SQL is a very high-level programming language Structured Query Language
• This works because it is optimized well!
SQL is a…
• Data Definition Language (DDL)
• Define relational schemata
• Create/alter/delete tables and their attributes
10
Lecture 2 > Section 1 > Definitions
Tables in SQL
A relation or table is a
Product multiset of tuples
PName Price Manufacturer
having the attributes
specified by the schema
Gizmo $19.99 GizmoWorks
11
Lecture 2 > Section 1 > Definitions
Tables in SQL
A multiset is an
unordered list (or: a set
Product
with multiple duplicate
PName Price Manufacturer instances allowed)
Gizmo $19.99 GizmoWorks
12
Lecture 2 > Section 1 > Definitions
Tables in SQL
13
Lecture 2 > Section 1 > Definitions
Tables in SQL
Product
PName Price Manufacturer
14
Lecture 2 > Section 1 > Definitions
Tables in SQL
Product
PName Price Manufacturer
The number of
attributes is the arity of
the relation
15
Lecture 2 > Section 1 > Definitions
16
Lecture 2 > Section 1 > Definitions
Table Schemas
• The schema of a table is the table name, its attributes, and their
types:
17
Lecture 2 > Section 1 > Keys & constraints
Key constraints
A key is a minimal subset of attributes that acts as a
unique identifier for tuples in a relation
• i.e. if two tuples agree on the values of the key, then they must be
the same tuple!
Students(sid:string, name:string, gpa: float)
In SQL, we may constrain a column to be NOT NULL, e.g., “name” in this table
Lecture 2 > Section 1 > Keys & constraints
General Constraints
• We can actually specify arbitrary assertions
• E.g. “There cannot be 25 people in the DB class”
ACTIVITY: Activity-2-1.ipynb
22
Lecture 2 > Section 2
2. Single-table queries
23
Lecture 2 > Section 2
24
Lecture 2 > Section 2 > SFW
SQL Query
• Basic form (there are many many more bells and whistles)
SELECT <attributes>
FROM <one or more relations>
WHERE <conditions>
25
Lecture 2 > Section 2 > SFW
SELECT *
FROM Product
WHERE Category = ‘Gadgets’
Notation
28
Lecture 2 > Section 2 > SFW
A Few Details
• SQL commands are case insensitive:
• Same: SELECT, Select, select
• Same: Product, product
29
Lecture 2 > Section 2 > Other operators
SELECT *
FROM Products
WHERE PName LIKE ‘%gizmo%’
30
Lecture 2 > Section 2 > Other operators
32
Lecture 2 > Section 2 > ACTIVITY
ACTIVITY: Activity-2-2.ipynb
33
Lecture 2 > Section 3
3. Multi-table queries
34
Lecture 2 > Section 3
2. Joins: basics
35
Lecture 2 > Section 3 > Foreign Keys
Product
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
39
Lecture 2 > Section 3 > Joins: Basics
Joins
Product(PName, Price, Category, Manufacturer)
Company(CName, StockPrice, Country) Note: we will often omit
attribute types in schema
Ex: Find all products under $200 manufactured in Japan; definitions for brevity, but
return their names and prices. assume attributes are
always atomic types
40
Lecture 2 > Section 3 > Joins: Basics
Joins
Product(PName, Price, Category, Manufacturer)
Company(CName, StockPrice, Country)
41
Lecture 2 > Section 3 > Joins: Basics
Joins
Product(PName, Price, Category, Manufacturer)
Company(CName, StockPrice, Country)
42
Lecture 2 > Section 3 > Joins: Basics
Joins
Product
Company
PName Price Category Manuf
Gizmo $19 Gadgets GWorks Cname Stock Country
Powergizmo $29 Gadgets GWorks
GWorks 25 USA
Canon 65 Japan
SingleTouch $149 Photography Canon
Hitachi 15 Japan
MultiTouch $203 Household Hitachi
44
Lecture 2 > Section 3 > Joins: Semantics
45
Lecture 2 > Section 3 > Joins: semantics
Answer = {}
for x1 in R1 do
for x2 in R2 do
…..
for xn in Rn do
if Conditions(x1,…, xn)
then Answer = Answer È {(x1.a1, x1.a2, …, xn.ak)}
return Answer
SELECT Country
FROM Product, Company
WHERE Manufacturer=CName AND Category=‘Gadgets’
50
Lecture 2 > Section 3 > Joins: semantics
Lecture 2 > Section 3 > ACTIVITY
ACTIVITY: Lecture-2-3.ipynb
52
Lecture 2 > Section 3 > ACTIVITY
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
53
Lecture 2 > Section 3 > ACTIVITY
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
S T
R But what if S = f?
54
Lecture 2 > Section 3 > ACTIVITY
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
Announcements
57
A note on quality, not quantity:
Today’s Lecture
3. Advanced SQL-izing
• ACTIVITY: Fancy SQL Part II
60
Lecture 3 > Section 1
61
Lecture 3 > Section 1
2. Nested queries
62
Lecture 3 > Section 1 > Set Operators
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
63
Lecture 3 > Section 1 > Set Operators
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
S T
R But what if S = f?
64
Lecture 3 > Section 1 > Set Operators
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
• Semantics:
Joins / cross-products are just nested for
1. Take cross-product
loops (in simplest implementation)!
3. Apply projection 66
Lecture 3 > Section 1 > Set Operators
output = {}
for r in R:
for s in S:
for t in T:
if r[‘A’] == s[‘A’] or r[‘A’] == t[‘A’]:
output.add(r[‘A’])
return list(output)
Can you see now what happens if S = []? See bonus activity on website! 67
Lecture 3 > Section 1 > Set Operators
Multiset Operations
68
Lecture 3 > Section 1 > Set Operators
70
Lecture 3 > Section 1 > Set Operators
For sets,
𝝀 𝒁 = 𝝀 𝑿 + 𝝀 𝒀 this is union
71
Lecture 3 > Section 1 > Set Operators
72
Lecture 3 > Section 1 > Set Operators
SELECT R.A
FROM R, S
WHERE R.A=S.A 𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∩ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}
INTERSECT
SELECT R.A
FROM R, T
WHERE R.A=T.A Q1 Q2
73
Lecture 3 > Section 1 > Set Operators
UNION
𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∪ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}
WHERE R.A=T.A
What if we want
duplicates?
Lecture 3 > Section 1 > Set Operators
UNION ALL
SELECT R.A
FROM R, S
WHERE R.A=S.A
UNION ALL 𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∪ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}
SELECT R.A
FROM R, T
WHERE R.A=T.A Q1 Q2 ALL indicates
Multiset
operations
75
Lecture 3 > Section 1 > Set Operators
EXCEPT
SELECT R.A
FROM R, S
WHERE R.A=S.A
EXCEPT 𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 \{𝑟. 𝐴|𝑟. 𝐴 = 𝑡. 𝐴}
SELECT R.A
FROM R, T
WHERE R.A=T.A Q1 Q2 What is the
multiset version?
76
Lecture 3 > Section 1 > Set Operators
78
Lecture 3 > Section 1 > Set Operators
80
Lecture 3 > Section 1 > Nested Queries
82
Lecture 3 > Section 1 > Nested Queries
Nested Queries
Are these queries equivalent?
SELECT c.city SELECT c.city
FROM Company c FROM Company c,
WHERE c.name IN ( Product pr,
SELECT pr.maker Purchase p
FROM Purchase p, Product pr WHERE c.name = pr.maker
WHERE p.name = pr.product AND pr.name = p.product
AND p.buyer = ‘Joe Blow‘) AND p.buyer = ‘Joe Blow’
Beware of duplicates!
83
Lecture 3 > Section 1 > Nested Queries
Nested Queries
84
Lecture 3 > Section 1 > Nested Queries
88
Lecture 3 > Section 1 > Nested Queries
90
Lecture 3 > Section 1 > ACTIVITY
Activity-3-1.ipynb
91
Lecture 3 > Section 2
92
Lecture 3 > Section 2
2. GROUP BY
93
Lecture 3 > Section 2 > Aggregation
Aggregation
SELECT AVG(price) SELECT COUNT(*)
FROM Product FROM Product
WHERE maker = “Toyota” WHERE year > 1995
94
Lecture 3 > Section 2 > Aggregation
Aggregation: COUNT
We probably want:
More Examples
Purchase(product, date, price, quantity)
96
Lecture 3 > Section 2 > Aggregation
Simple Aggregations
Purchase
Product Date Price Quantity
bagel 10/21 1 20
banana 10/3 0.5 10
banana 10/10 1 10
bagel 10/25 1.50 20
99
Lecture 3 > Section 2 > GROUP BY
100
Lecture 3 > Section 2 > GROUP BY
101
Lecture 3 > Section 2 > GROUP BY
102
Lecture 3 > Section 2 > GROUP BY
HAVING Clause
Why?
• S = Can ONLY contain attributes a1,…,ak and/or aggregates over other attributes
• C1 = is any condition on the attributes in R1,…,Rn
• C2 = is any condition on the aggregate expressions
105
Lecture 3 > Section 2 > GROUP BY
Evaluation steps:
1. Evaluate FROM-WHERE: apply condition C1 on the
attributes in R1,…,Rn
2. GROUP BY the attributes a1,…,ak
3. Apply condition C2 to each group (may have aggregates)
4. Compute aggregates in S and return the result
106
Lecture 3 > Section 2 > GROUP BY
108
Lecture 3 > Section 2 > GROUP BY
• Attempt #2- With group-by: How about when written this way?
Activity-3-2.ipynb
110
Lecture 3 > Section 3
3. Advanced SQL-izing
111
Lecture 3 > Section 3
2. NULLs
3. Outer Joins
112
Lecture 3 > Section 3 > Quantifiers
Quantifiers
Product(name, price, company)
Company(name, city)
An existential quantifier is a
logical quantifier (roughly) Existential: easy ! J
of the form “there exists”
113
Lecture 3 > Section 3 > Quantifiers
Quantifiers
Find all companies
Product(name, price, company) with products all
Company(name, city) having price < 100
Equivalent
SELECT DISTINCT Company.cname
FROM Company
WHERE Company.name NOT IN( Find all companies
SELECT Product.company that make only
FROM Product.price >= 100) products with price
< 100
A universal quantifier is of
the form “for all” Universal: hard ! L
114
Lecture 3 > Section 3 > NULLs
NULLS in SQL
• Whenever we don’t have a value, we can put a NULL
• The schema specifies for each attribute if can be null (nullable attribute) or
not
Null Values
• For numerical operations, NULL -> NULL:
• If x = NULL then 4*(3-x)/7 is still NULL
FALSE = 0
UNKNOWN = 0.5
TRUE = 1
116
Lecture 3 > Section 3 > NULLs
Null Values
• C1 AND C2 = min(C1, C2)
• C1 OR C2 = max(C1, C2)
• NOT C1 = 1 – C1
Null Values
Unexpected behavior:
SELECT *
FROM Person
WHERE age < 25 OR age >= 25
118
Lecture 3 > Section 3 > NULLs
Null Values
Can test for NULL explicitly:
• x IS NULL
• x IS NOT NULL
SELECT *
FROM Person
WHERE age < 25 OR age >= 25
OR age IS NULL
119
Lecture 3 > Section 3 > Outer Joins
120
Lecture 3 > Section 3 > Outer Joins
However: Products that never sold (with no Purchase tuple) will be lost!
121
Lecture 3 > Section 3 > Outer Joins
Outer Joins
• An outer join returns tuples from the joined relations that don’t have a
corresponding tuple in the other relations
• I.e. If we join relations A and B on a.X = b.X, and there is an entry in A with X=5, but
none in B with X=5…
• A LEFT OUTER JOIN will return a tuple (a, NULL)!
INNER JOIN:
Product Purchase
name category prodName store
name store
SELECT Product.name, Purchase.store
FROM Product Gizmo Wiz
INNER JOIN Purchase
ON Product.name = Purchase.prodName Camera Ritz
Camera Wiz
Note: another equivalent way to write an
INNER JOIN!
123
Lecture 3 > Section 3 > Outer Joins
name store
125
Lecture 3 > Section 3 > ACTIVITY
Activity-3-3.ipynb
126
Lecture 2 & 3 > SUMMARY
Summary
127