796.01K

Data. DB. DBMS

1.

Data. DB. DBMS
DB Basics

2.

What is Data?
Minsk, Ivan, 7, Kuprevicha, 80221234567, 220022, Ivanov
0110011101001111110001100111110101110011
TUE
WED
THU
FRI
SAT
18°
25°
20°
23°
21°
5401 0610 0570 0150

3.

What is Information?
Data
Information
Data is a set of facts in a “raw”, nonprocessed state. Data requires processing to
become useful for people.
When data is processed, organized,
structured and can be used by a human or a
PC to gain value / profit.
Minsk, Ivan, 7, Kuprevicha, 80221234567, 220022, Ivanov
processing
Ivan Ivanov
Kuprevicha, 7
Minsk, 220022
8-(022)-1234567

4.

What is DataBase (DB)?
Real life examples:
Booking tickets
Weather prediction
Supermarket purchases
Financial analytics
Instagram
Training courses

5.

Common DataBase Types
Relational DB
Non-Relational DB
Key-Value
Document
Relational Database Management Systems (RDBMS)
Graph

6.

Common DataBase Types
Trainees
Name
Columns
Table
Date of Birth
Gender
Trainees
Education
Prog. experience
Name
Date of Birth
Gender
Education
Prog. experience
Maria Petrova
14.03.1988
F
Completed High
False
Ivan Ivanov
25.05.1979
M
Incomplete High
True





Rows
Cell

7.

Common DataBase Types
Relational database
Key-value database
Table
Collection
Table row
Element of collection
Table column
Key of an element
Table cell
Value by element’s key
"Trainees": [{
"Name":
"Maria Petrova",
"Date of Birth":
"14.03.1988",
"Gender":
"F",
"Education":
"Completed High",
"Prog. experience": "False"
}, {
"Name":
"Ivan Ivanov",
"Date of Birth":
"25.05.1979",
"Gender":
"M",
"Education":
"Incomplete High",
"Prog. experience": "True"
}
]

8.

What is DataBase Management System (DBMS)
Users
DBMS
Other DBMS
define, record,
query, update,
manage data
Applications

9.

What RDBMS is for?
Manage database backup and recovery processes
DBMS

10.

What RDBMS is for?
Restrict data access according to predefined rules
Trainees Performance
10
12
9,8
10
9,6
8
9,4
DBMS
9,2
6
Maria Petrova
Maria Petrova
Ivan Ivanov
9
4
8,8
2
8,6
8,4
0
DB Basics
SQL Foundation
SQL for Analysis

11.

What RDBMS is for?
Allow database access via one of predefined interfaces (most common is SQL query)
SQL
DBMS

12.

What RDBMS is for?
Support data consistency when multiple users work with same pieces of data
Users
Users
DBMS
Users
Users

13.

DB Components
DB Basics

14.

Relational DataBase Components: Table
Name
Date of Birth
Gender
Education
Prog. experience
Hair Color
Favorite Brand
Maria Petrova
14.03.1988
F
Completed High
False
Light brown
Nike
Ivan Ivanov
25.05.1979
M
Incomplete High
True
Blonde
Gant







text date
boolean

15.

Relational DataBase Components: Constraint
Name
Date of Birth
Gender
Education
Prog. experience
Maria Petrova
14.03.1988
F
Completed High
False
Ivan Ivanov
25.05.1979
M
Incomplete High
True
Petr Pirogov
17.11.1987
M
Super High
True
Inna Sidorova
01.01.1992
F
High
Upper High
False
Volha Hlebava
06.06.1991
F
Completed High
Null
Igor Lesik
15.10.2020
15.10.2020
M
Completed High
True
As example: Current Date = 01.01.2020

16.

Relational DataBase Components: Keys
Surrogate
Primary
Primary
Key Key
PassportID
TraineesID
Name
Date of Birth
Gender
Education
Prog. experience
1
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
MP2222222
2
Ivan Ivanov
25.05.1979
M
Incomplete High
True
3
BM3333333
Ivan Ivanov
25.05.1979
M
Incomplete High
True
?

17.

Relational DataBase Components: Keys
Trainees
Primary
Key
PassportID
Name
Date of Birth
Gender
Education
Prog. experience
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
BM3333333
Ivan Ivanov
25.05.1979
M
Incomplete High
True
Performance
PassportID
Foreign
Key
Module
Avg. Mark
MP1111111
DB Basics
9.8
BM3333333
DB Basics
9.2
MP1111111
SQL Foundation
8.9
BM3333333
SQL Foundation
8.6
MP1111111
SQL For Analysis
9.2

18.

Relational DataBase Components: relationship 1-to-1
Workstation
Trainees
Trainees
Workstation
PassportID
Name
Date of Birth
Gender
Education
Prog. experience
PassportID
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
MP1111111
CCCCCCC4321
BM3333333
Ivan Ivanov
25.05.1979
M
Incomplete High
True
BM3333333
CCCCCCC3322
Workstation
CCCCCCC55555

19.

Relational DataBase Components: relationship 1-to-many
Training Group
Trainees
Training Group
GroupID
Group Name
Trainees
GroupID
PassportID
Name
Date of Birth
Gender
Education
Prog. exp
1
Data Training 1
1
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
2
.Net Training 1
1
BM3333333
Ivan Ivanov
25.05.1979
M
Incomplete High
True
2
КН8888888
Marta Yakova
96.06.1984
F
Completed High
True

20.

Relational DataBase Components: relationship many-to-many
Trainees
Performance
Trainees
Modules
Performance
Modules
PassportID
Name
PassportID
ModuleID
Avg. Mark
ModuleID
MP1111111
Maria Petrova
MP1111111
1
9.8
1
DB Basics
BM3333333
Ivan Ivanov
MP1111111
2
8.9
2
SQL Foundation
MP1111111
3
9.2
3
SQL for Analysis
BM3333333
1
9.2
BM3333333
2
8.6
Module Name

21.

Relational database components
1 Tables
2 Constraints
3 Keys
4 Indexes
5 Views
6 Partitions

22.

DB Modeling
DB Basics

23.

DataBase Structure: Relational Model Terms
Relation
Table
Primary Key
Relation
Key
Trainees
PassportID
Name
Date of Birth
Gender
Education
Prog. experience
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
MP2222222
Ivan Ivanov
25.05.1979
M
Incomplete High
True
Columns
Attributes
Degree (total # of columns) = 6
Rows
Tuple
Cardinality
(total # of rows) = 2
Attribute
Cell value

24.

DataBase Structure: Rules
Trainees
PassportID
Name
Date of Birth
Gender
Education
Prog. experience
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
MP2222222
Ivan Ivanov
25.05.1979
M
Incomplete High
True
MP1313311
Petr Pirogov
17.11.1987
M
Completed High
True
BM1112233
Inna Sidorova
01.01.1992
F
Completed High
False

25.

DataBase Structure: Rules
Trainees
PassportID
Name
Date of Birth
Gender
Education
Prog. experience
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
MP2222222
Ivan Ivanov
25.05.1979
M
Incomplete High
True
MP1313311
Petr Pirogov
17.11.1987
M
Completed High
True
BM1112233
Inna Sidorova
01.01.1992
F
Completed High
False
MP2222222
Ivan Ivanov
25.05.1979
M
Incomplete High
True

26.

DataBase Structure: Rules
Trainees
Trainees
PassportID
Name
DateName
of Birth
Passport
Name
Gender
Education
Prog. experience
MP1111111
Maria Petrova
Maria
14.03.1988
Petrova
F
Completed High
False
MP2222222
Ivan Ivanov
25.05.1979
Ivan Ivanov
M
Incomplete High
True
MP1313311
Petr Pirogov
Petr
17.11.1987
Pirogov
M
Completed High
True
BM1112233
Inna Sidorova
Inna
01.01.1992
Sidorova
F
Completed High
False

27.

DataBase Structure: Rules
Trainings Participations
PassportID
Modules
MP1111111
DB Basics, SQL Foundations, SQL for Analysis
MP2222222
DB Basics, SQL Foundations
Trainings Participations
PassportID
Modules
MP1111111
DB Basics
MP1111111
SQL Foundations
MP1111111
SQL for Analysis
MP2222222
DB Basics
MP2222222
SQL Foundations

28.

DataBase Modeling: Conceptual Model
Trainees
Modules
Includes the important entities and the
relationships among them
No attributes are specified
No keys are specified

29.

DataBase Modeling: Logical Model
Trainees
Performance
Modules
PassportID
PassportID
ModuleID
Name
ModuleID
Module_Name
Date_of_Birth
Avg_Mark
Duration
Prog_Experience
Hair_Color
All attributes are specified
The primary key for each entity specified
Foreign keys are specified
Relationships are specified
Normalization occurs at this level

30.

DataBase Modeling: Physical Model
DIM_TRAINEES
FCT_PERFORMANCE
DIM_MODULES
PASSPORT_ID
INTEGER
PASSPORT_ID
INTEGER
MODULE_ID
INTEGER
NAME
VARCHAR (30)
MODULE_ID
INTEGER
MODULE_NAME
VARCHAR (30)
DATE_OF_BIRTH
DATE
AVG_MARK
DECIMAL (2,1)
DURATION
INTEGER
PROG_EXPERIENCE
BOOLEAN
Convert entities into tables
Convert attributes into columns
Convert relationships into foreign keys
Modify the physical data model based on
physical constraints / requirements

31.

DataBase Modeling: Rules
Trainees
TrainersID
Name
Date_of_Birth
Date of Birth
Gender
Education
Education
Образование
Prog_Experience
MP1111111
Maria Petrova
14.03.1988
F
Completed High
False
MP2222222
Ivan Ivanov
25.05.1979
M
Incomplete High
True
MP1313311
Petr Pirogov
17.11.1987
M
Completed High
True
BM1112233
Inna Sidorova
01.01.1992
F
Completed High
False
Performance
date
text
PassportID
ModuleID
Avg. Mark
MP1111111
1
9.2
MP1111111
2
8.6
decimal
integer
integer
decimal

32.

DataBase Structure: Rules
Tables
Constraints
Primary Key
Foreign Key
Indexes

33.

DataBase Tasks
Transactional App
Analytics
Normalization
Purchases
OLTP
Item
Client
Time
Store
Cashier
Phone
Maria Petrova
no
2019-12-30 16:19:23
Store1
Cashier1
720
Laptop
Gleb Petrov
2019-12-30 16:24:17
Store2
Cashier2
2150
TV
Natalia Sidorova
2019-12-30 16:26:15
Store2
Cashier3
5240
Phone
Ivan Ivanov
2019-12-30 16:29:05
Store1
Cashier4
1650
OLAP
Price Denormalization

34.

Normalization
DB Basics

35.

DataBase Modelling Approaches
Trainees
Modules
PassportID
Name
Date_of_Birth
Gender
Education
Prog_Eexperience
ModuleID
MP1111111
Mariya
Maria
Petrova
Maria Petrovskaya
14.03.1988
F
Completed High
False
1
DB Basics
2
SQL Foundation
3
SQL for Analysis
Performance
PassportID
Name
Date_of_Birth
ModuleID
Module_Name
Avg_Mark
MP1111111
Mariya
MariaPetrovskaya
Petrova
14.03.1988
1
DB Basics
9.8
MP1111111
Mariya
Petrovskaya
Maria
Petrova
Maria
14.03.1988
2
SQL Foundation
8.9
MP1111111
Maria
Petrova
Petrova
Mariya
Petrovskaya
14.03.1988
3
SQL for Analysis
9.2
Module_Name

36.

Conversion to 1NF
Trainees
PassportID
Name
Phone_Number
Date_of_Birth
Gender
Education
Prog_experience
MP1111111
Maria Petrova
375291111111
14.03.1988
F
Completed High
False
BM3333333
Ivan Ivanov
375292222222,
375253333333
25.05.1979
M
Incomplete High
True
Trainees
PassportID
Name
Phone_Number
Date_of_Birth
Gender
Education
Prog_experience
MP1111111
Maria Petrova
375291111111
14.03.1988
F
Completed High
False
BM3333333
Ivan Ivanov
375292222222
25.05.1979
M
Incomplete High
True
BM3333333
Ivan Ivanov
375253333333
25.05.1979
M
Incomplete High
True

37.

Conversion from 1NF to 2NF

38.

Conversion from 1NF to 2NF
Trainees
Trainees_Modules
ModuleName
ModuleDuration
PassportID
Name
PassportID
ModuleID
MP1111111
Maria Petrova
MP1111111
1
DB Basics
1
BM3333333
Ivan Ivanov
MP1111111
2
SQL Foundation
4
MP1111111
3
SQL for Analysis
4
BM3333333
1
DB Basics
1
Trainees
PassportID
Name
PassportID
ModuleID
MP1111111
Maria Petrova
MP1111111
1
MP1111111
2
MP1111111
3
BM3333333
1
BM3333333
Ivan Ivanov
Modules
Trainees_Modules
ModuleID
ModuleName
ModuleDuration
1
DB Basics
1
2
SQL Foundation
4
3
SQL for Analysis
4

39.

Conversion from 2NF to 3NF

40.

Conversion from 2NF to 3NF
Modules Modules
ModuleID
ModuleName
ModuleDuration
Address
AddressPhone
1
DB Basics
1
K1/1
+375299876543
2
SQL Foundation
4
DM3
+375293456789
3
SQL for Analysis
4
K1/1
+375299876543
Address
Address
AddressPhone
K1/1
+375299876543
DM3
+375293456789

41.

Disadvantages of Normalization
Groups
Trainees
Performance
Modules
Address
GroupID
PassportID
PassportID
ModuleID
AddressID
GroupName
GroupID
ModuleID
Module_Name
Address
Name
Avg_Mark
Duration
Date_of_Birth
AddressID
Trainers
Prog_Experience
TrainerID
TrainerID
TrainerName
Email
A completely normalized database needs clear and broad understanding of the business
More tables to join: by spreading out your data into more tables, you increase the need to join tables
The data model is optimized for applications, not for Ad-hoc querying

42.

Denormalization
Modules
Trainees
PassportID
GroupID
Name
Date_of_Birth
Prog_Experience
Groups
GroupID
GroupName
Performance
ModuleID
PassportID
Module_Name
ModuleID
Duration
GroupID
AddressID
Address
TrainerID
AddressID
Avg_Mark
Address
Trainers
TrainerID
TrainerName
Email
To enhance query performance
To make a database more convenient
to manage
To facilitate and accelerate reporting
Does not mean not doing
normalization.
It is an optimization technique that is
applied after doing normalization

43.

Data Warehousing
Data
Data Warehouse
OLAP
OLTP
Extract
Clean
Transform
Load
Refresh
Analysis
DWH
Data Marts
External Data
English     Русский Правила