SQL和Regular Expression专题
1. SQL
1.1简介
SQL包括多种sub-languages,主要是DDL (Data Definition Language)和DML (Data Manipulation Language)。例如,create, alter table都是DDL,data update/insert/delete, query都是DML。DDL管理schema,DML管理sets of tuples。
1.2基本语法
SQL syntax包括:
(1)comment,用--表示
(2)identifers,与常见程序语言不同,双引号也可以使用,case-insensitive。例如,如果一个变量名是date,为了和Built-in type/function区分,可以命名为"date",对不能使用单引号'date',单引号在SQL中表示string
(3)keywords,例如CREATE, DROP, TABLE,一般SQL的keywords都要大写(课程中给的程序这点做的不好)
(4)data types,例如integer, date, serial(是unique的,很适合作为primary key的类型)等
(5)operators,与常见编程语言相似
(6)constants,与常见编程语言相似
SQL语法和常见语言相似,但专注于数据库的处理,没有特别复杂的语法结构和算法。(Oracle允许使用#和$,MySQL允许使用末尾单引号代替双引号)
Conventions:relation names大写首字母,例如Branch;attribute names小写首字母,例如name, code。
1.3 Types/Constants in SQL
(1)Numeric types
注意serial是aotoincrementing integer,适合primary key,不会有重复;注意numeric(precision, scale),由于计算机中的浮点数都是非准确数字,而现实中很多运算必须要求准确,尤其是金融行业,使用numeric能实现,比如很多货币只到小数点后两位,scale可以设置为2,例如23.51符合numeric(4, 2)。
(2)Character
注意Char,必须满足Char的位数,适合证件号之类的数据;注意varChar,数据长度满足1到指定长度,适合名字之类的数据。
注:转义字符以E开头,用\标记。例如E'O\'Brien'等于string O'Brien。
注:类型转换Type-casting,用::标记。例如‘10’ :: integer,把string10转换为integer10。
(3)Boolean
Valid literal values for the "true" state are: TRUE 't' 'true' 'y' 'yes' 'on' '1'
For the "false" state, the following values can be used: FALSE 'f' 'false' 'n' 'no' 'off' '0'
(4)Date/Time
除此之外,还有Monetary Types, Binary Data Types, Enumerated Types, Geometric Types, Network Address Types, Bit String Types, Text Search Types, UUID Types, XML Types, JSON Types, Arrays, Composite Types, Range Types, Object Identifier Types, Pesudo-Types. (https://www.postgresql.org/docs/9.3/static/datatype.html)
(5)自定义
定义domains:
CREATE DOMAIN Name AS Type CHECK (Constraint)
定义tuple types:
CREATE TYPE Name AS ( AttrName AttrType, ... )
如果需要规定排序ording,使用enumerated type:
CREATE TYPE Name AS ENUM ( 'Label', ... )
domain定义练习:
定义positive integers
CREATE DOMAIN PositveInteger AS integer CHECK (value > 0);
定义person's age
CREATE DOMAIN PersonAge AS integer CHECK (value >= 0 and value <= 200);
定义course code
CREATE DOMAIN CourseCode AS char(8) CHECK (value ~ '[A-Z]{4}[0-9]{4}');
定义student ID
CREATE DOMAIN ID AS char(7) CHECK (value ~ '[0-9]{7}')
enumerated types练习:
CREATE DOMAIN SizeValues1 AS
text CHECK (value in ('small','medium','large'));
CREATE TYPE SizeValues2 AS
ENUM ('small','medium','large');
--前者的顺序是medium, large, small;后者的顺序是small, medium, large
tuple练习:
INSERT INTO Student(studeID, name, degree)
VALUES (2177364, 'Jack Smith', 'BSc')
-- tuple literal,values的顺序不能动,要与上面student的属性对应
CONSTRAINT CHECK gender IN ('male','female')
-- set literal,没有顺序,性别顺序不重要
1.4 SQL Operators
(1)常见比较符号
>, <, <=, >=, =, <>
注意:没有==符号,<>的含义是!=
(2)string comparision
1.str1 < str2,按照字典顺序排序
2.str LIKE pattern,match string to pattern
注意:matches anything的符号是% (regular expression中是.*);matches any single char的符号是- (regular expression中是.)
SQL-based pattern matching
name LIKE 'Ja%' --name begins with 'Ja'
name LIKE '_i%' --name has 'i' as 2nd letter
name LIKE '%o%o%' --name contains two 'o's
name LIKE '%ith' --name ends with 'ith'
name LIKE 'John' --name equals 'John'
Regular Expression-based pattern matching:
Attr ~ 'RegExp' or Attr !~ 'RegExp'
Attr ~* 'RegExp' or Attr !~* 'RegExp'
~ matches regular expression, case sensitive
~* matches regular expression, case insensitive
!~ does not match regular expression, case sensitive
!~* does not match regular expression, case insensitive
name ~ '^Ja' --name begins with 'Ja'
name ~ '^.i' --name has 'i' as 2nd letter
name ~ '.*o.*o.*' --name contains two 'o's
name ~ 'ith$' --name ends with 'ith'
name ~ 'John' --name contains 'John'
(3)string manipulation
str1 || str2 ... return concatenation of str1 and str2
例如,a||' '||b的含义是a空格b
lower(str) ... return lower-case version of str
substring(str,start,count) ... extract substring from str
注意:以上,if any operand is NULL, result is NULL
(4)arithmetic operation
+ - * / abs ceil floor power sqrt sin etc.
count(attr) ... number of rows in attr column
sum(attr) ... sum of values for attr
avg(attr) ... mean of values for attr
min/max(attr) ... min/max of values for attr
(5)Null Value Table
使用方法:x IS NULL ;x IS NOT NULL
1.5 Conditional Expressions
coalesce(val1,val2,...valn)
returns first non-null value vali
useful for providing a "displayable" value for nulls
nullif(val1,val2)
returns NULL if val1 is equal to val2
can be used to implement an "inverse" to coalesce
CASE
WHEN test1 THEN result1
WHEN test2 THEN result2
...
ELSE resultn
END
2. SQL Schemas
Relations创建表达
CREATE TABLE RelName (
attribute1 domain1 constraints1,
attribute2 domain2 constraints2,
...
table-level constraints, ...
);
Table Definition表达
create table Students (
id integer, -- e.g. 3123456
familyName text, -- e.g. 'Smith'
givenName text, -- e.g. 'John'
birthDate date, -- e.g. '1-Mar-1984'
degree integer, -- e.g. 3648
wam float, -- e.g. 84.75
primary key (id),
foreign key (degree) references Degrees(id)
);
如何处理互相指向的table数据加载,如
create table R (
id integer primary key,
s char(1) references S(id)
);
create table S (
id char(1) primary key,
r integer references R(id)
);
alter table R add foreign key (s) references S(id) deferrable;
--使用deferrable
3. SQL Queries
(1)基本queries语句
SELECT projectionList
FROM relations/joins
WHERE condition
GROUP BY groupingAttributes
HAVING groupCondition
FROM, WHERE, GROUP BY, HAVING clauses are optional.
(2)Views
A view associates a name with a query:
CREATE VIEW viewName [ ( attributes ) ] AS Query