4.00190. How do we modify a database?

SQL commands can be divided into two main sublanguages. The Data Definition Language (DDL) contains the commands used to create and destroy databases and database objects. These commands will primarily be used by database administrators during the setup and removal phases of a database project.

After the database structure is defined with DDL, database administrators and users can utilize the Data Manipulation Language to insert, retrieve and modify the data contained within it.

SQL-Transaction Statements control transactions in database access. This subset of SQL is also called the Data Control Language for SQL (SQL DCL).

There are 2 SQL-Transaction Statements:

COMMIT Statement -- commit (make persistent) all changes for the current transaction
ROLLBACK Statement -- roll back (rescind) all changes for the current transaction

A database transaction is a larger unit that frames multiple SQL statements. A transaction ensures that the action of the framed statements is atomic with respect to recovery.

A SQL Modification Statement has limited effect. A given statement can only directly modify the contents of a single table (Referential Integrity effects may cause indirect modification of other tables.) The upshot is that operations which require modification of several tables must involve multiple modification statements. A classic example is a bank operation that transfers funds from one type of account to another, requiring updates to 2 tables. Transactions provide a way to group these multiple statements in one atomic unit.

In SQL92, there is no BEGIN TRANSACTION statement. A transaction begins with the execution of a SQL-Data statement when there is no current transaction. All subsequent SQL-Data statements until COMMIT or ROLLBACK become part of the transaction. Execution of a COMMIT Statement or ROLLBACK Statement completes the current transaction. A subsequent SQL-Data statement starts a new transaction.

In terms of direct effect on the database, it is the SQL Modification Statements that are the main consideration since they change data. The total set of changes to the database by the modification statements in a transaction are treated as an atomic unit through the actions of the transaction. The set of changes either:

Is made fully persistent in the database through the action of the COMMIT Statement, or
Has no persistent effect whatever on the database, through:

the action of the ROLLBACK Statement,
abnormal termination of the client requesting the transaction, or
abnormal termination of the transaction by the DBMS. This may be an action by the system (deadlock resolution) or by an administrative agent, or it may be an abnormal termination of the DBMS itself. In the latter case, the DBMS must roll back any active transactions during recovery.

The DBMS must ensure that the effect of a transaction is not partial. All changes in a transaction must be made persistent, or no changes from the transaction must be made persistent.

In the relational model, each transaction is completely isolated from other active transactions. After initiation, a transaction can only see changes to the database made by transactions committed prior to starting the new transaction. Changes made by concurrent transactions are not seen by SQL DML query and modification statements. This is known as full isolation or Serializable transactions.

COMMIT Statement

The COMMIT Statement terminates the current transaction and makes all changes under the transaction persistent. It commits the changes to the database. The COMMIT statement has the following general format:

COMMIT [WORK]

WORK is an optional keyword that does not change the semantics of COMMIT.

ROLLBACK Statement

The ROLLBACK Statement terminates the current transaction and rescinds all changes made under the transaction. It rolls back the changes to the database. The ROLLBACK statement has the following general format:

ROLLBACK [WORK]

WORK is an optional keyword that does not change the semantics of ROLLBACK.

In SQL:

INSERT Statement -- add rows to tables
UPDATE Statement -- modify columns in table rows
DELETE Statement -- remove rows from tables

Data Definition Language:

Alter

Once you've created a table within a database, you may wish to modify the definition of it. The ALTER command allows you to make changes to the structure of a table without deleting and recreating it. Take a look at the following command:

ALTER TABLE personal_info
ADD salary money null

This example adds a new attribute to the personal_info table -- an employee's salary. The "money" argument specifies that an employee's salary will be stored using a dollars and cents format. Finally, the "null" keyword tells the database that it's OK for this field to contain no value for any given employee.

DROP

The final command of the Data Definition Language, DROP, allows us to remove entire database objects from our DBMS. For example, if we want to permanently remove the personal_info table that we created, we'd use the following command:

DROP TABLE personal_info

Similarly, the command below would be used to remove the entire employees database:

DROP DATABASE employees

Use this command with care! Remember that the DROP command removes entire data structures from your database. If you want to remove individual records, use the DELETE command of the Data Manipulation Language.

Data Manipulation Language:

The Data Manipulation Language (DML) is used to retrieve, insert and modify database information. These commands will be used by all database users during the routine operation of the database. Let's take a brief look at the basic DML commands:

INSERT

The INSERT command in SQL is used to add records to an existing table. Returning to the personal_info example from the previous section, let's imagine that our HR department needs to add a new employee to their database. They could use a command similar to the one shown below:

INSERT INTO personal_info
values('bart','simpson',12345,$45000)

Note that there are four values specified for the record. These correspond to the table attributes in the order they were defined: first_name, last_name, employee_id, and salary.

Another way to explain the same concept:

Insertion is expressed by:

insert into
values

The attributes of relation r must be , and must be a value of the domain of . The attributes and values need not be listed in the order of the attributes of r.

Example - If I bought an Orvis 9 weight (flyrod-stock-num of 2) on 10/23/95, either of the following insert statements will add the appropriate tuple to the purchased relation:

insert into purchased (cust-num, flyrod-stock-num, date)
values (1, 2, 10/23/95)

insert into purchased (flyrod-stock-num, date, cust-num)
values (2, 10/23/95, 1)

If the attribute values in the values clause are ordered in the same order as the attributes of the relation scheme, the attribute list in the insert clause can be omitted.

Example - since Purchased-scheme = (cust-num, flyrod-stock-num, date), the following is equivalent to the previous insert statements:

insert into purchased
values (1, 2, 10/23/95)

The values can be replaced with a select statement to insert the result of a query into a relation.

Example - insert all employees into the customer relation, using their social-security-num as their cust-num.

insert into customer
select name, address, social-security-num
from employee

This adds an extra tuple for Elvis (with a different cust-num). Thus, Elvis now has 2 cust-num's.

Fix:

insert into customer
select name, address, social-security-num
from employee
where (name, address) not in
(select name, address
from customer)

Old versions of SQL would not allow such an insert statement, because the same relation appears in the insert clause and a nested query.

Newer versions permit such insert statements, using a semantics like that used for delete statements.

the set of tuples to insert is calculated first
all insertions occur after this calculation

This avoids the apparent contradiction in the previous example.

The insert statement can insert null values into a relation. This can be useful if partial information needs to be stored, but causes many problems.

Example - to add the Winston flyrod with unknown length to the flyrod relation:

insert into flyrod
values ("Winston", null, 7, 4)

The following is another explanation for INSERT:

The INSERT Statement adds one or more rows to a table. It has two formats:

INSERT INTO table-1 [(column-list)] VALUES (value-list)

and,

INSERT INTO table-1 [(column-list)] (query-specification)

The first form inserts a single row into table-1 and explicitly specifies the column values for the row. The second form uses the result of query-specification to insert one or more rows into table-1. The result rows from the query are the rows added to the insert table. Note: the query cannot reference table-1.

Both forms have an optional column-list specification. Only the columns listed will be assigned values. Unlisted columns are set to null, so unlisted columns must allow nulls. The values from the VALUES Clause (first form) or the columns from the query-specification rows (second form) are assigned to the corresponding column in column-list in order.

If the optional column-list is missing, the default column list is substituted. The default column list contains all columns in table-1 in the order they were declared in CREATE TABLE, or CREATE VIEW.

VALUES Clause

The VALUES Clause in the INSERT Statement provides a set of values to place in the columns of a new row. It has the following general format:

VALUES ( value-1 [, value-2] ... )

value-1 and value-2 are Literal Values or Scalar Expressions involving literals. They can also specify NULL.

The values list in the VALUES clause must match the explicit or implicit column list for INSERT in degree (number of items). They must also match the data type of corresponding column or be convertible to that data type.

INSERT Examples

INSERT INTO p (pno, color) VALUES ('P4', 'Brown')

Before

After

pno	descr	color
P1	Widget	Blue
P2	Widget	Red
P3	Dongle	Green

=>

pno	descr	color
P1	Widget	Blue
P2	Widget	Red
P3	Dongle	Green
P4	NULL	Brown

INSERT INTO sp

SELECT s.sno, p.pno, 500

FROM s, p

WHERE p.color='Green' AND s.city='London'

Before

After

sno	pno	qty
S1	P1	NULL
S2	P1	200
S3	P1	1000
S3	P2	200

=>

sno	pno	qty
S1	P1	NULL
S2	P1	200
S3	P1	1000
S3	P2	200
S2	P3	500

SELECT

The SELECT command is the most commonly used command in SQL. It allows database users to retrieve the specific information they desire from an operational database. Let's take a look at a few examples, again using the personal_info table from our employees database.

The command shown below retrieves all of the information contained within the personal_info table. Note that the asterisk is used as a wildcard in SQL. This literally means "Select everything from the personal_info table."

SELECT *
FROM personal_info

Alternatively, users may want to limit the attributes that are retrieved from the database. For example, the Human Resources department may require a list of the last names of all employees in the company. The following SQL command would retrieve only that information:

SELECT last_name
FROM personal_info

Finally, the WHERE clause can be used to limit the records that are retrieved to those that meet specified criteria. The CEO might be interested in reviewing the personnel records of all highly paid employees. The following command retrieves all of the data contained within personal_info for records that have a salary value greater than $50,000:

SELECT *
FROM personal_info
WHERE salary > $50000

UPDATE

The UPDATE command can be used to modify information contained within a table, either in bulk or individually. Each year, our company gives all employees a 3% cost-of-living increase in their salary. The following SQL command could be used to quickly apply this to all of the employees stored in the database:

UPDATE personal_info
SET salary = salary * 1.03

On the other hand, our new employee Bart Simpson has demonstrated performance above and beyond the call of duty. Management wishes to recognize his stellar accomplishments with a $5,000 raise. The WHERE clause could be used to single out Bart for this raise:

UPDATE personal_info
SET salary = salary + $5000
WHERE employee_id = 12345

An update statement is used to modify multiple tuples in the database at one time.

Example - suppose that all employees who make more than $20000 get a 5% raise:

update employee
set salary = salary * 1.05
where salary > 20000

The order of updates can be significant. Suppose that all employees who make more than $20000 get a 5% raise, and those who make less than or exactly $20000 get a 6% raise. The sequence of updates:

update employee
set salary = salary * 1.06
where salary <= 20000

update employee
set salary = salary * 1.05
where salary > 20000

gives employees who were making slightly less than $20000 an 11.3% raise! Reversing the order of the updates fixes the problem.

The where clause of an update statement can contain a nested query, and the nested query can reference the relation being updated.

Example - give all employees making less than the average salary a 5% raise:

update employee
set salary = salary * 1.05
where salary <
(select avg(salary)
from employee)

Another way of explaining the same concept:

The UPDATE statement modifies columns in selected table rows. It has the following general format:

UPDATE table-1 SET set-list [WHERE predicate]

The optional WHERE Clause has the same format as in the SELECT Statement. See WHERE Clause. The WHERE clause chooses which table rows to update. If it is missing, all rows are in table-1 are updated.

The set-list contains assignments of new values for selected columns. See SET Clause.

The SET Clause expressions and WHERE Clause predicate can contain subqueries, but the subqueries cannot reference table-1. This prevents situations where results are dependent on the order of processing.

SET Clause

The SET Clause in the UPDATE Statement updates (assigns new value to) columns in the selected table rows. It has the following general format:

SET column-1 = value-1 [, column-2 = value-2] ...

column-1 and column-2 are columns in the Update table. value-1 and value-2 are expressions that can reference columns from the update table. They also can be the keyword -- NULL, to set the column to null.

Since the assignment expressions can reference columns from the current row, the expressions are evaluated first. After the values of all Set expressions have been computed, they are then assigned to the referenced columns. This avoids results dependent on the order of processing.

UPDATE Examples

UPDATE sp SET qty = qty + 20

Before

After

sno	pno	qty
S1	P1	NULL
S2	P1	200
S3	P1	1000
S3	P2	200

=>

sno	pno	qty
S1	P1	NULL
S2	P1	220
S3	P1	1020
S3	P2	220

UPDATE s

SET name = 'Tony', city = 'Milan'

WHERE sno = 'S3'

Before

After

sno	name	city
S1	Pierre	Paris
S2	John	London
S3	Mario	Rome

=>

sno	name	city
S1	Pierre	Paris
S2	John	London
S3	Tony	Milan

DELETE

Finally, let's take a look at the DELETE command. You'll find that the syntax of this command is similar to that of the other DML commands. Unfortunately, our latest corporate earnings report didn't quite meet expectations and poor Bart has been laid off. The DELETE command with a WHERE clause can be used to remove his record from the personal_info table:

DELETE FROM personal_info
WHERE employee_id = 12345

DELETE:

Deletion is expressed by:

delete from r
where P

which deletes all tuples from relation r that satisfy predicate P.

Example - delete Elvis Presley from the customer relation:

delete from customer
where name = "Elvis Presley"

As with the select clause, an omitted where clause defaults to where true

Example - delete all tuples from the customer relation:

delete from customer

This does not delete the customer relation.

The where clause can refer to other relations, or even use a nested query.

Example - delete all customers who haven't purchased a flyrod over 8.0 feet long:

delete from customer
where cust-num not in
(select cust-num
from purchased, flyrod
where purchased.flyrod-stock-num = flyrod.flyrod-stock-num and length > 8.0)

In older versions of SQL, the relation in the delete clause could not appear in the from clause of an embedded query. Newer versions adopt a semantics that avoids problems with such delete statements.

Example - delete all employees who make more than the average salary:

delete from employee
where salary >
(select avg(salary)
from employee)

Note that the average would change while the deletion was occurring. To fix this problem, modern versions of SQL do the following:

evaluate the where clause for each tuple of the relation being deleted from. No tuples are deleted in this step, but the result of evaluating the where clause is stored for each tuple.
after the where clause is evaluated for each tuple, all tuples that satisfied the where clause are deleted.

That is, SQL marks tuples for deletion first, and then deletes all tuples. This guarantees that the where clause is always evaluated over the value of the relation before any deletions occur.

The following is another explanation.

The DELETE Statement removes selected rows from a table. It has the following general format:

DELETE FROM table-1 [WHERE predicate]

The optional WHERE Clause has the same format as in the SELECT Statement. See WHERE Clause. The WHERE clause chooses which table rows to delete. If it is missing, all rows are in table-1 are removed.

The WHERE Clause predicate can contain subqueries, but the subqueries cannot reference table-1. This prevents situations where results are dependent on the order of processing.

DELETE Examples

DELETE FROM sp WHERE pno = 'P1'

Before

After

sno	pno	qty
S1	P1	NULL
S2	P1	200
S3	P1	1000
S3	P2	200

=>

sno	pno	qty
S3	P2	200

DELETE FROM p WHERE pno NOT IN (SELECT pno FROM sp)

Before

After

pno	descr	color
P1	Widget	Blue
P2	Widget	Red
P3	Dongle	Green

=>

pno	descr	color
P1	Widget	Blue
P2	Widget	Red

Views:

Review: a view is a "subset" of the conceptual level of a DBMS.

In the relational model, views are "virtual relations". Views can be used anywhere relations can be used, but are not actually stored in the database.

Common reasons for using views:

to restrict access to information
to provide a more convenient interface to certain information.

Example (restricted access): suppose the flyshop hires Theodore Kaczynski on a prison work release program. He should NOT have access to customer and employee addresses.

Example (convenient interface): a view that gives both customer and employee addresses for sending out the newsletter, etc.

In SQL, a view is defined as follows:

create view view-name as SQL-query

where:

view-name is the name of the new view
SQL-query is any legal SQL query

Examples:

the views for Ted Kaczynski:

create view customer-Ted as
select name, cust-num
from customer

create view employee-Ted as
select name, social-security-num, salary
from employee

the newsletter view:

create view newsletter as
(select name, address
from customer)
union
(select name, address
from employee)

Ted should not have access to the newsletter view.

Because a view can change whenever one of the relations it depends on changes, views are usually not stored as relations in the database. Instead, the SQL query itself is stored and used to compute the view whenever it is referenced.

A view name can be used anywhere a relation name can be used.

Example - find the name of everyone receiving the newsletter:

select name
from newsletter

To execute this query, the query defining newsletter is executed, and then name is selected from that result relation.

A view definition is deleted by name using the following command:

drop view view-name

SQL has 3 basic categories of statements:

· SQL-Data Statements -- query and modify tables and columns

o SELECT Statement -- query tables and views in the database

o INSERT Statement -- add rows to tables

o UPDATE Statement -- modify columns in table rows

o DELETE Statement -- remove rows from tables