
Database system
导航
- 导航
- introduction
- chapter2-Relational Model
- chapter3-introduction to SQL
- chapter4-intermediate SQL
- chapter5-advanced SQL
introduction
purpose of database system
- 早期database application建立在file systems上有data redundancy,data isolation,difficulty in accessing data等问题。
- integrity problems(完整性问题)
在数据库中integrity constraints can be stated explicitly. - Atomicity problems(原子性问题)
在数据库中主要通过恢复功能来解决原子性问题。 - concurrent access anomalies(并发访问异常)
对于并发操作若不采取任何限制,可能会导致数据混乱。 - Security problems
数据库系统提供以上问题的解决措施。
Characteristics of Databases
- data persistence
- convenience in accessing data
- data integrity
- data security
- concurrency control for multiple users
- failure recovery
view of data
- 三级抽象databases(Hide the complexities,enhance the adaptation to changes)
- 主要通过两层map来体现数据库的物理数据独立性和逻辑数据独立性。
- view level
- logical level
- physical level
data Models
most important:Relational Model
Database languages
Data Definition Langauge(数据定义语言)
- DDL compiler generates a set of table templates stored in a data dictionary
- Data dictionary contains metadata
Data Manipulation Language(数据操作语言)
SQL:widely used non-procedural language(陈述式语言)
1
2
3select name
from instructor
where instructor.ID = '12345'
Database Design
- Entity Relationship Model(实体-联系模型)
- Normalization Theory
Database Engine
Storage manager
chapter2-Relational Model
Basic Structure
- 定义:Formally, given sets D1, D2, …. Dn a relation r is a subset of D1 x D2 x … x Dn.Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai $\in$ Di。也就是说虽然我们可视化展示的是一个列表,但是本质是没有顺序的,因为是一个集合。
- A1,A2….An are attributes
- R=(A1,A2,..,An) is a relation schema
- Database shcema – is the logic structure of the database.
Attributes
- The set of allowed values for each attribute is called the domain(域) of the attribute.
- 通常值是原子的.
- 在数据库中为每一个域都规定了相同的空值.
keys
- 定义:K$\in$R.K is a superkey(超健) of R if values for K are sufficient to identify a unique tuple of each possible relation r(R).
- Example: {ID} and {ID,name} are both superkeys of instructor.(但是名字不是,因为存在重名的情况)
- Superkey K is a candidate key(候选健) if K is minimal.
- Example: {ID} is a candidate key of instructor.
- One of the candidate keys is selected to be the primary key(主键).
- Foreign key(外键) constraint from attribute(s) A of relation r1 to the primary key B of relation r2 states that on any database instance, the value of A for each tuple in r1 must also be the value of B for some tuple in r2. (参照完整性,类似于外键限制,但是并不一定局限于主键)
解释:由上面的例子,第三排的course中的dept_name就是外键,因为course中的dept_name和department中的dept_name(是主键)是同一个值,所以course中的dept_name就是外键。
解释:红色箭头表示参照完整性,黑键则表示外键。
Relational Algebra
- 输入几个关系,通过某个操作得到一个新的关系。
- Six basic operations:
- select(横向选择)$\theta$
- project(投影某几列,并且结果会去重)$\pi$
- union(对两个关系并起来)
- 两个关系必须有相同的元数(same number of attributes)
- attributes的域需要相容
- set difference -
- 笛卡尔积(表的乘积) x
- rename $\rho$ (当对一个关系想做笛卡尔积的时候,可以对其改名再执行笛卡尔积)
Additional Operations
- 自然联结的例子
- 除法操作
- 结果里面的元组与除数元组连接一定在被除的关系中
Extended Relational-Algebra-Operations
Generalized Projection
Aggregate Functions
chapter3-introduction to SQL
Data Definition Langauge
Domain Types in SQL
- char(n). Fixed length character string, with user-specified length n.(与C语言有差异,因为C语言的字符串的结尾一定为\0,它以字符数组的形式存储,而数据库作为字典直接可以从存储中截取对应长度,不需要额外的标识符)
- varchar(n). Variable length character strings, with user-specified maximum length n.
- int. Integer (a finite subset of the integers that is machine-dependent).
- smallint. Small integer (a machine-dependent subset of the integer domain type).
- numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point.
→ number(3,1) allows 44.5 to be store exactly, but neither 444.5 or 0.32 - real, double precision. Floating point and double-precision floating point numbers, with machine-dependent precision.
- float(n). Floating point number, with user-specified precision of at least n digits.
Built-in Data Types in SQL
- date
- time
- timestamp(data plus time of day)
- interval(period of time)
Create Table Construct
1 |
|
- 再研究key
- not null
- primary key(A1,…,An)
- foreign key(Am,…An) references r(对于被引用的表如果发生delete或者update,可以进行这四个操作:cascade(那个表的值怎样变,此表的值就怎么变), restrict(限制不变), set null, set default)
1 |
|
Basic Query Structure
The select clause
- SQL 大小写不分的
- find the names of all instructors
- select name
- from instructor
The from clause
- The from clause lists the relations involved in the query
- Corresponds to the Cartesian product operation of the relational algebra
Natural join example
1 |
|
string operations
Ordering the display of tuples
1 |
|
Aggregate Functions
- 聚合函数的使用
- 直接使用聚合函数:适用于对整个表或满足某些条件的数据进行聚合操作。
- 结合 GROUP BY 使用聚合函数:适用于按某个字段分组后对每组分别进行聚合操作。
- 因此,是否需要 GROUP BY 取决于具体的业务需求。如果只是对整体数据进行统计,无需分组;如果需要按某些字段分组统计,则必须使用 GROUP BY。
1 |
|
group by clause
- 当我们使用group by的时候是一个聚合操作,所以最终的的select的结果也是以组的形式存在
- ex
having clause
1 |
|
Nested Subqueries
Set membership
- 通过in和not in来实现嵌套操作
1 |
|
set comparison
- 利用some和all与给定集合进行比较
1 |
|
test for empty relations
- 利用exists和not exists
1 |
|
Test for
Modification of the Database
chapter4-intermediate SQL
joined relations
- join types
- inner join(默认的笛卡尔积均是inner join)
- left outer join(from后面的表的所有列均保留,如果没有成功连接上,则置为NULL)
- right outer join(join后面的表的所有列均保留)
- full outer join
- join conditions
- natural join(自然连接)
- on condition(条件连接)
- using(A1,…,An)(等值连接)
- example
- 统计每个老师教的课程(但是可能包含没有课程的老师)
- select id,count(course_id)
- from instructor natural left outer join teaches
- group by id;
- 统计每个老师教的课程(但是可能包含没有课程的老师)
SQL Data Types and schemas
User-defined types
- create type construct in SQL creates user-defined type
- create type Dollars as numeric(12,2) final
domains
- domain is a user-defined type that specifies a set of values and a set of constraints on the values in the set.domains相比于types,domain可以定义一些约束,比如长度,范围,正则表达式等。
- craete domain person_name char(20) not null
integrity constraints
Not NULL and Unique constraints
the check clause
views
- A view provides a mechanism to hide certain data
from the view of certain users.(虚拟的表,没有实际存在) - 好处:简化复杂性,适应能力增加,提供安全权限管理功能。
example views
- a view of instructors without their salary
- create view faculty as
select ID, name, dept_name
from instructor; - 当我们创建好这个view以后,我们可以把这个view当作实体的表进行表的各种操作
- create view faculty as
views defined using other views
- 在定义view的基础上,可以再定义一个view
- create view physics_fall_2009 as
select course.course_id, sec_id, building, room_number
from course, section
where course.course_id = section.course_id
and course.dept_name = ’Physics’
and section.semester = ’Fall’
and section.year = ’2009’; - create view physics_fall_2009_watson as
select course_id, room_number
from physics_fall_2009
where building= ’Watson’; - 这样的操作其实我们可以在一个语句中实现
select course_id, room_number
from physics_fall_2009(已经创建的view)
where building= ’Watson’;
→
select course_id, room_number
from (select course.course_id, building, room_number
from course, section
where course.course_id = section.course_id
and course.dept_name = ’Physics’
and section.semester = ’Fall’
and section.year = ’2009’)(把view创建的语句嵌套在from中)
where building= ’Watson’;
→
select course_id, room_number
from course, section
where course.course_id = section.course_id
and course.dept_name = ’Physics’
and section.semester = ’Fall’
and section.year = ’2009’
and building= ’Watson’;(sql本质运行的一种语句)
update of a view
- view相当于只是打开一扇窗,所以对view的修改,实际上是对view的表进行修改。当然孙老师说,为了提高查询的高效性,有些时候确实会创建一个临时表(materialized view)。
- example
insert into faculty values (’30765’, ’Green’, ’Music’);
insert into instructor values (’30765’, ’Green’, ’Music’, null); - 但是有些时候view的update不能被翻译为对原始表的独立的操作,因为这个view选取的元素可能不是超键。
view and logical data dependence
- 当我们底层的物理数据发生变化,比如一个表分成了两个表,因为我们用户层面可能用到原来的表,那么我们可以利用现在的两个表生成一个代表原来表的view,对于用户层面没有任何变化,但是底层逻辑发生改变。
Indexes
transactions(事务)
- 在数据库系统中,事务是指一组操作的集合,这些操作要么全部执行成功,要么全部不执行,以确保数据的一致性和完整性。
transaction properties
- 原子性(Atomicity):事务是一个不可分割的工作单位,事务中的所有操作要么都做,要么都不做。
- 一致性(Consistency):事务必须使数据库从一个一致状态变换到另一个一致状态。如果事务执行成功,则数据库处于一致状态;如果事务执行失败,则数据库应恢复到事务开始前的状态。
- 隔离性(Isolation):多个事务并发执行时,一个事务的执行不应影响其他事务的执行结果。每个事务都应该独立运行,就好像它是唯一正在运行的事务一样。
- 持久性(Durability):一旦事务提交,它对数据库的更改应该是永久性的,即使系统发生故障。
transaction example
SET AUTOCOMMIT=0;
UPDATE account SET balance=balance -100 WHERE ano=‘1001’;
UPDATE account SET balance=balance+100 WHERE ano=‘1002’;
COMMIT;(上一条事务结束,下一个事务开始)
UPDATE account SET balance=balance -200 WHERE ano=‘1003’;
UPDATE account SET balance=balance+200 WHERE ano=‘1004’;
COMMIT;
UPDATE account SET balance=balance+balance*2.5%;
COMMIT;
transaction operation
- BEGIN TRANSACTION; 或 START TRANSACTION;:显式地开始一个新的事务。
- COMMIT;:提交当前事务,使其所有更改永久生效。
- ROLLBACK;:回滚当前事务,撤销所有未提交的更改,是数据库恢复到事务开始之前的状态。
- SAVEPOINT savepoint_name;:设置一个保存点,可以在部分回滚时使用。
- ROLLBACK TO savepoint_name;:回滚到指定的保存点,而不是整个事务。
Autorization(授权)
- the grant statement is used to confer authorization:grant (privilege list) on (relation name or view name) to (user list)
- user list:
- user-id
- public,which allows all valid the privilege granted
- a role(more on this later)
privileges in SQL
grant select on instructor to U1, U2, U3
grant select on department to public
grant update (budget) on department to U1,U2
grant all privileges on department to U1
- the revoke statement is used to revoke authorization:revoke (privilege list) on (relation name or view name) from (user list)
chapter5-advanced SQL
Accessing SQL from a programming language
- two approaches to access database
- API
- Embedded SQL
API
- ODBC works with C,C++,C#
- JDBC works with Java
- Open a connection
- Create a “statement” object
- Execute queries using the Statement object to send queries and fetch results
- Exception mechanism to handle errors
- Embedded SQL in C
- SQLC-embedded SQL in Java
- JPA-OR mapping of Java
JDBC
JDBC code
1 |
|
Prepared Statement
- 先生成,类型未知,再利用set语句和update语句进行更新
- prepared statement也可以避免SQL注入攻击,因为我们提前进行编译,就会防止产生SQL的歧义。
1 |
|
Metadata
- JDBC可以利用getMetaData()方法来获取数据库的元数据,比如列名,列类型,列数目等。
ODBC(open database connectivity)
ODBC Code
- 完全通过函数调用实现
1 |
|
main body
1 |
|
ODBC prepared statements
Embedded SQL
procedural constructs in SQL
SQL functions
- 定义函数后再调用
1 |
|
SQL procedures
1 |
|
Triggers
- A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database.
- ECA rule
- Event: A modification to the database(insert,delete,update)
- Condition: A condition that must be satisfied for the trigger to execute
- Action: An action to be performed when the condition is satisfied
Trigger example
1 |
|