1. ER-to-relational Mapping Review
ER的entity对应relational的table;ER entity的attribute对应table的attribute;ER的relationship map到relational的时候尽量减少数据null的可能,同时少创建新的table。(n:m的relationship必须要创建新的table;1:n的relationship要把1的primary key和relationship的attribute都作为foreign key添加进n的table;1:1的优先侧重total participation)
注意:ER map to relational的时候不能把所有关系map过去,比如一些total和partial participation是没有办法兼顾的。
2. from 2-way to n-way relationships
binary relationship根据1:1, 1:n, n:m很容易从ER map,之后是大于2个entity的relationship如何map。通常有两种方式,一种是把relationship当成一个entity将>2的relationship转化成binary,另一种是不当成entity。
(1)不将>2的relationship处理成entity
presribe连接了3个entity,doctor, patient, drug。仿写many to many的binary处理方法,新建一个table放置prescribe关系,将doctor, patient, drug的primary key放入新table作为foreign key,同时存储quantity和date这样的relationship attribute。剩下一个问题是新table的primary key如何设置?一种方法是直接用相连的entity的primary key形成一个key组,但这样的坏处是与relationship实际情况有差别,在这种情况下意味着database中一个病人从一个医生开一种药一生只能有一次。所以推荐另一种方式,根据实际情况来设置。在当前情境下,prescribe一定要有药的dno,由病人取药,事情一定发生在具体的一个date。这样,primary key一定要有date, pid, dno,意味着不同date同一个病人找同一个医生可以开同一种药,doctor没有放入primary key是因为可以通过这组primary key检索到prescribe的doctor。这里不能再省略drug,不然primary key只有date和pid,意味着一个病人在一天内找一个医生只能开一种药。
create domain NameValue as varchar(100) not null;
create table Doctor(
tfn integer,
name NameValue,
specialty text not null,
primary key (tfn)
);
create table Drug(
dno integer,
name NameValue unique,
formula text,
primary key (dno)
);
create table Patient(
pid integer,
name NameValue,
address text not null,
primary key (pid)
);
create table Prescribes(
doctor integer not null references Doctor(tfn),
drug integer references Drug(dno),
patient integer references Patient(pid),
quantity integer not null,
"date" date,
primary key ("date", drug, patient)
);
(2)把relationship当成一个entity将>2的relationship转化成binary
把prescribe的relationship变成了entity,出现3个binary relationship,其中2个是1 to many,不需要新table,另一个many to many,需要一个新的table。
create domain NameValue as varchar(100) not null;
create table Doctor(
tfn integer,
name NameValue,
specialty text not null,
primary key (tfn)
);
create table Drug(
dno integer,
name NameValue unique,
formula text,
primary key (dno)
);
create table Patient(
pid integer,
name NameValue,
address text not null,
primary key (pid)
);
create table Prescription(
prNum integer,
"date" date not null,
doctor integer not null references Doctor(tfn),
patient integer not null references Patient(pid),
primary key (prNum)
);
create table PrescriptionItem(
prescription integer references Prescription(prNum),
drug integer references Drug(dno),
quantity integer check (quantity > 0),
primary key (prescription, drug)
);
3. Mapping composite attributes
上例中,address和name都是composite attributes,如何map取决于实际使用需要。如果需要读取address和name,则第一种写法比较合适,把所有的下层attribute打个包写在address和name中。但是如果需要搜索在某个postcode中的所有人,那么下层的attribute需要被使用,这个时候第二个写法比较合适,把所有下层的attribute分开写入table。
4. Mapping multi-valued attributes
上例中,favourite colours是multi-valued,必须新建一个table,不然一个确定的人在这个数据库中就只能有一个喜欢的颜色。新建的table中不仅要有colour,还要有SSN,而且二者都要是primary key。如果只有SSN是primary key,那么在FavColour table中会出现相同的SSN对应不同的colour,而SSN在person中是primary key,必须是unique的,产生冲突,故SSN和colour都必须是primary key。
5. Derived attributes
上例中,age可以由birthDate计算得出,所以不要写入person的table中
create table Person(
ssn char(20),
familyName varchar(50),
givenName varChar(50) not null,
weight float,
birthDate date,
primary key (ssn)
);
create table PersonHobbies(
person char(20) references Person(ssn),
hobby varchar(50),
primary key (person, hobby)
);
6. Mapping subclasses
通常有三种方法,一种是用ER style,subclass的table中包含subclass的attribute和superclass的primary key;一种是object-oriented style,subclass中包括subclass的attribute和所有superclass的attribute;一种是single table with nulls,扁平化,只有一个table。
(1)ER style
Employee和Manager的table都包含Person的primary key,SSN在Employee和Manager中都作为primary key,但同时也是foreign key连接到Person。
(2)OO style
Employee和Manager的table都包含Person的SSN,但SSN不做搜索用,因为所有信息都已经在Employee和Manager的table中。
(3)Single table with Nulls
一个table的坏处是会出现很多有null的instance,这里如果只有bonus是NULL,则这个人是employee;如果salary, position, bonus都是NULL,则这个人是person。
7. Mapping disjoint subclass
(1)ER style
这个例子中student的subclass是total participation,在ER style中无法体现。
create table Student (
sid integer primary key,
name text,
address text
);
create table Ugrad (
sid integer references Student(sid),
degree text,
primary key (sid)
);
create table Masters (
sid integer references Student(sid),
major text,
primary key (sid)
);
create table Research (
sid integer references Student(sid),
thesis text,
primary key (sid)
);
(2)OO style
这个例子中student的subclass是total participation,在OO style中无法体现。
create table Student (
sid integer primary key,
name text,
address text
);
create table Ugrad (
sid integer references Student(sid),
name text,
address text,
degree text,
primary key (sid)
);
create table Masters (
sid integer references Student(sid),
name text,
address text,
major text,
primary key (sid)
);
create table Research (
sid integer references Student(sid),
name text,
address text,
thesis text,
primary key (sid)
);
(3)Single table with Nulls
这个例子中student的subclass是total participation,在single table中可以表示出来。
create table Student (
sid integer primary key,
name text,
address text,
degree text,
major text,
thesis text,
constraint DisjointTotal check
((degree is not null and major is null and thesis is null)
or
(degree is null and major is not null and thesis is null)
or
(degree is null and major is null and thesis is not null))
);
如果是partial participation
create table Student (
sid integer primary key,
name text,
address text,
degree text,
major text,
thesis text,
constraint DisjointPartial check
((degree is not null and major is null and thesis is null)
or
(degree is null and major is not null and thesis is null)
or
(degree is null and major is null and thesis is not null)
or
(degree is null and major is null and thesis is null))
);
8. Mapping overlapping subclass
ER style和OO style没有办法直接表示出overlapping和disjoint的区别,需要加trigger,比如
stype text not null
check (stype in ('ugrad', 'masters', 'research')),
single table with Nulls, total participation
create table Student (
sid integer primary key,
name text,
address text,
degree text,
major text,
thesis text,
constraint OverlappingTotal check
(degree is not null or major is not null or thesis is not null)
);
single table with Nulls, partial participation
create table Student (
sid integer primary key,
name text,
address text,
degree text,
major text,
thesis text
-- no constraint needed
);
9. Relational DBMSs
Relational DBMSs近似于Relational model,帮助定义domains, attributes, tuples, tables, constraints(domain, key, referential)。区别是它不严格限定table必须要有keys,使用bag semantics而不是set semantics。
bag semantics可以理解为multiset,在集合中允许重复元素,比如{a, a, b}
RDBMS的operation要符合ACID:ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee database transactions are processed reliably.
10. Using PostgreSQL in CSE
注意使用后要停止server并且登出CSE的主机。
wagner$ ssh YOU@grieg
grieg$ priv srvr
grieg$ source /srvr/YOU/env
grieg$ pg start
grieg$ psql mydb
... do stuff with your database ...
grieg$ pg stop
grieg$ exit
11. Building/Maintaining Databases
Shell commands
createdb dbname
dropdb dbname
psql dynamo -f dumpfile
--restore/dump file
SQL statements
CREATE DATABASE dbname
DROP DATABASE dbname
CREATE TABLE table ( Attributes+Constraints )
ALTER TABLE table TableSchemaChanges
COPY table ( AttributeNames ) FROM STDIN
DROP TABLE table(s)
TRUNCATE TABLE table(s)
--Remove the content of the table, but the table stays
INSERT INTO table ( attrs ) VALUES tuple(s)
DELETE FROM table WHERE condition
UPDATE table SET AttrValueChanges WHERE condition