CS 448/688

Assignment #1 Solution

Notes:

There are typically several possible correct solutions for each question (but there are many more possible incorrect solutions).
For simplicity, throughout this solution, the symbol ¥ indicates JOIN.

Ex 4.5.2

a) p_ename (s_{aname=’Boeing’}( Employees ¥ Certified ¥ Aircrafts ) )

b) { t[ename] | e Î Employees Ù $c (c Î Certified Ù c[eid] = e[eid] Ù

$a (a Î Aircrafts Ù a[aid] = c[aid] Ù a[aname] = ‘Boeing’))}

Employees	eid	ename	salary
	_EID	P.ename

Aircraft	aid	aname	cruisingrange
	_AID	‘Boeing’

Certified	eid	aid
	_EID	_AID

Ex 4.5.4

a) People have interpreted this in different ways. Some (including the book’s authors) have interpreted it as requesting flights that can be flown by some pilot, in which case a solution is as follows:

p_flno(s_{cruisingrange}_³_distance_Ù_{salary > 100,000} ( Employees ¥ Certified ¥ Aircrafts ¥ Flights))

The proper answers use division, since you want all the high-paid pilots:

p_{flno, eid,ename,salary}(s_{cruisingrange}_³_distance (( Employees ¥ Certified ¥ Aircrafts) ´ Flight) )

¸ s_{salary > 100,000}(Employees)

(Note that since Flight has no common attributes with the other relations, natural join will provide the same result as cross-product.)

b) For the first form, we get the following:

{ f[flno] | f Î Flights Ù $a, c, e (a Î Aircraft Ù c Î Certified Ù e Î Employees Ù

a[cruisingrange] ³ f[distance] Ù e[salary] > 10000 Ù a[aid] = c[aid] Ù e[eid] = c[eid])}

For the second form, we get the following:

{f[flno] | f Î Flights Ù "e (e Î Employees Ù e[salary] > 100,000 Ù

$c (c Î Certified Ù c[eid] = e[eid] Ù

$a (a Î Aircraft Ù c[aid] = a[aid] Ù f[distance] £ a[cruisingrange]) ) ) ) }

Employees	eid	ename	salary
	_EID1		>100,000
	_EID2		>100,000

Certified	eid	aid
	_EID1	_AID

Aircraft	aid	aname	cruisingrange
	_AID		_CR

Flight	flno	from	to	distance	departs	arrives
	P.G.			_Dist

Conditions

_CR ³ _Dist

COUNT._EID1 = COUNT._EID2

Ex 4.5.6

a) p_eid (Employee) – p_E1.eid(r(E1,Employees) ¥_{E1.salary < E2.salary} r(E2,Employees))

b) { e1[eid] | e1Î Employees Ù "e2 (e2 Î Employees Þ e2[salary] £ e1[salary])}

{e1[eid] | e1Î Employees Ù Ø$e2 (e2 Î Employees Ù e2[salary]> e1[salary])}

Employees	eid	ename	salary
	P.		_S1
			_S2

Conditions

S1 = MAX.S2

Ex 4.5.8

a) This query requires us to count the number of aircraft each pilot is certified for; however, the COUNT operation is not provided in the relational algebra. Thus this query is not expressible in relational algebra.

b) Same as part a), we don’t have the required COUNT operation in tuple relational calculus, so this query is not expressible in tuple relational calculus.

c) This is possible to express in QBE, but it is tricky to compare the results of aggregated values. One method is to use an intermediate relation (as described in Section 6.9 in the text):

Certified	eid	aid		Counts	eid	num
	G._EID	_AID		I.	_EID	COUNT._AID

This defines a new relation and inserts corresponding tuples. Then we can write:

Counts	eid	num
	P.	_CNT
		_ ALLCNT

Conditions

CNT = MAX.ALLCNT

Ex 4.5.10

a) We require a SUM operation to do this query, but it is not provided in the relational algebra, so this query is not expressible in relational algebra.

b) Same as part a), we don’t have the required SUM operation in tuple relational calculus, so this query is not expressible in tuple relational calculus.

Employees	eid	ename	salary
			_S	P.SUM._S

2. Ex 4.4.1

a) This query will produce an empty result or will be declared wrong by the compiler. This is because the projection of Parts on ‘sid’ does not make sense.

One possibility is to try a direct translation, but insert an impossible condition on the tuple variable ranging over Parts, as here:

{ s[sname] | s Î Suppliers Ù $p (p Î Parts Ù p[color] = ‘red’ Ù p Ï Parts Ù

$c (c Î Catalog Ù c[cost] < 100 Ù c[sid] = s[sid]))}

If we interpret the question as having a typo, and that the projection was meant to be on ‘pid’, we get:

{ s[sname] | s Î Suppliers Ù $p (p Î Parts Ù p[color] = ‘red’ Ù

$c (c Î Catalog Ù c[cost] < 100 Ù c[sid] = s[sid] Ù c[sid] = p[pid]))}

b) This cannot be expressed in QBE, as there is no ‘sid’ attribute in the Parts table. However, if we again treated it as a typo we would have gotten:

Suppliers	sid	sname	address
	_SID1	P._S

Catalog	sid	pid	cost
	_SID1	_PID1	<100

Parts	pid	pname	colour
	_PID1		red

Ex 4.4.3

a) { s[sname] | s Î Suppliers Ù $p (p Î Parts Ù p[color] = ‘red’ Ù

$c (c Î Catalog Ù c[cost] < 100 Ù c[pid] = p[pid] Ù s[sid] = c[sid])) Ù

$p2 (p2 Î Parts Ù p2[color] = ‘green’ Ù

$c2 (c2 Î Catalog Ù c2[cost] < 100 Ù c2[pid] = p2[pid] Ù

$s2 (s2 Î Supplies Ù s2[sid] = c2[sid] Ù s[sname] = s2[sname] ))) }

Suppliers	sid	sname	address
	_SID1	P._S
	_SID2	_S

Catalog	sid	pid	cost
	_SID1	_PID1	<100
	_SID2	_PID2	<100

Parts	pid	pname	colour
	_PID1		red
	_PID2		green

Ex 4.4.5

a) { s[sname] | s Î Suppliers Ù $p (p Î Parts Ù p[color] = ‘red’ Ù

$c (c Î Catalog Ù c[cost] < 100 Ù c[pid] = p[pid] Ù s[sid] = c[sid] ) ) Ù

$p2 (p2 Î Parts Ù p2 [color] = ‘green’ Ù

$c2 (c2 Î Catalog Ù c2[cost] < 100 Ù c2[pid] = p2[pid] Ù

$s2 (s2 Î Supplies Ù s2[sid] = c2[sid] Ù s[sname] = s2[sname] Ù

s[sid] = s2[sid]) ) ) }

Suppliers	sid	sname	address
	_SID	P.

Catalog	sid	pid	cost
	_SID	_PID1	<100
	_SID	_PID2	<100

Parts	pid	pname	colour
	_PID1		red
	_PID2		green

a) p _ENAME,RESP(s_DUR>12(EMP ¥ WORKS) )

b) p _PNAME,RESP( s _DUR>12_Ú_{BUDGET > 20000} (WORKS ¥ PROJ ) )

c) p _PNAME,ENAME( EMP ¥ PROJ ¥ (p _ENO,PNO(WORKS) –

p _ENO,PNO(r(A,WORKS) ¥_{(A.RESP = B.RESP}_Ù_{A.DUR< B.DUR)}r(B,WORKS) ) ) )

a) Note that we used a different notation to fit everything on one page.

Note: There is an additional unstated constraint that the time slots related to each TV channel must be pairwise non-overlapping.

b) Note: Primary keys are underlined and foreign keys are italic

Step 1 - Handling Entities:

TIME_SLOT(ID,start_time,end_time)

TV_CHANNEL(number, city)

TV_PROGRAM(name, genre, intended_audience, rating)

PERSON(name, date_of_birth,sex)

NETWORK(name)

Step 4 - 1:N Relationships

Modify relations to include foreign keys:

TV_PROGRAM(name, genre, intended_audience, rating, director_name: f.k. to DIRECTOR.name)

ACTOR(name, exclusive_network: f.k. to NETWORK.name)

TV_CHANNEL(number, city, type, owner, network_name: f.k. to NETWORK.name)

Add the constraint that network_name is not null iff type = “network”

Step 5: M:N Relationships

Add further tables:

STARS_IN(program_name: f.k. to TV_PROGRAM.name, actor_name: f.k. to ACTOR.name)

PROGRAM_SLOT(time_slot: f.k. to TIME_SLOT.ID, channel: f.k. to TV_CHANNEL.number)

Step 8: Specialization

When we use general specialization for Person we get:

ACTOR(name)

DIRECTOR(name)

Add the constraint that Actor.name and Director.name should be in Person.name

When we use disjoint specialization for the TV_Channel we get:

TV_CHANNEL(number, city, type, owner)

Add the constraint that type is either “network” or “independent”

Step 9: Aggregation

APPEARS_ON(time_slot: f.k. to TIME_SLOT.ID, channel: f.k. to TV_CHANNEL.number, program_name: f.k. to TV_PROGRAM.name)

Final set of relations:

TIME_SLOT (ID, start_time, end_time)

TV_CHANNEL(number, city, type, owner, network_name)

PROGRAM_SLOT (time_slot, channel)

TV_PROGRAM (name, genre, intended_audience, rating, director_name)

APPEARS_ON (time_slot, channel, program_name)

PERSON (name, date_of_birth, sex)

DIRECTOR (name)

ACTOR (name, exclusive_network)

STARS_IN (program_name, actor_name)

NETWORK (name)

As well as key constraints and foreign key constraints, we also have various domain constraints (i.e., whatever datatypes we would wish to associate with each attribute) plus inclusion dependencies between DIRECTOR.name and PERSON.name and between ACTOR.name and PERSON.name. We also have further constraint that TV_CHANNEL.network_name is not null iff TV_CHANNEL.type = “network”. (Note that because of this constraint we could eliminate the attribute TV_CHANNEL.type and use a test for NULL in TV_CHANNEL.network_name as the subtype discriminator.)