Basic structures: tables and rows

Database tables, rows, keys

Representing data in rows

Each object, i.e., a real-world individual of a class (for example, each customer who does business with our enterprise), is represented by a row of information in a database table. The row is defined in the relational model as a tuple that is constructed over a given scheme. Mathematically, the tuple is a function that assigns a constant value to each attribute of the scheme from its corresponding attribute domain. Notice that because the scheme is a set of attributes, we could show them in any order without changing the meaning of the data in the row (tuple).

Customers row or tuple
A customer tuple. Other views of this diagram: Large image - Description (text)

In formal notation, we could show the assignments explicitly, where the identifier t represents a tuple: tTJ = ‹first_name := 'Tom', last_name := 'Jewett', phone := '714-555-1212', street := '10200 Slater', zipcode := '92708'›

Inserting rows

In practice, when we create a table row in SQL, we are actually making the assignment of domain values to attributes, just as in the tuple definition.

INSERT INTO customers(first_name, last_name, phone, street, zipcode)
  VALUES ('Tom', 'Jewett', '714-555-1212', '10200 Slater', '92708');
SQL statement to insert one row with values for all columns.

In SQL, you can omit the attribute names from the INSERT INTO statement, as long as you keep the comma-delimited list of values in exactly the same order as was used to create the table. This syntax is more prone to errors, so use sparingly. It is provided for completion. Removing the attribute names from the above statement, gives the SQL statement below.

INSERT INTO customers
  VALUES ('Tom', 'Jewett', '714-555-1212', '10200 Slater', '92708');
SQL statement to insert one row with values for all columns in the order specified when the table was created.

It is also possible to list only some of the attribute names, the ones we provide values for; in that case, the remaining attributes will be assigned NULL in such rows, assuming there is no constraint prohibiting NULLs in such columns. In the case of the customers table, it was created with NOT NULL constraint on the name and phone number columns, so non-NULL values for these must be inserted. On the other hand, there is no such constraint on the other columns, so NULL is allowed.

INSERT INTO customers(first_name, last_name, phone)
  VALUES ('Alvaro', 'Monge', '562-985-4671'),
         ('Wayne', 'Dick', '562-985-1190');
SQL statement to insert two rows, omitting values for some columns.

Updating rows

The UPDATE statement is used to modify the value for one or more attributes in the specified rows of a table. When we change the data in a table row using SQL, we are also following the tuple definition of assigning domain values to attributes.

The statement below will modify all rows in table customers that have a phone number of '714-555-1212' and change them to have the phone number specified in the SET clause.

UPDATE customers
  SET phone = '714-555-2323'
  WHERE phone = '714-555-1212';
SQL statement to update a phone number in the table.

Deleting rows

The SQL DELETE statement is used to delete rows. When deleting rows, you can specify the rows to delete via an optional condition. The following figures provide two examples.

The statement below will delete all rows in table customers that have a value of '90840' in the cZipCode column.

DELETE FROM customers
  WHERE cZipCode = '90840';
SQL statement to delete rows from a table.

The statement below will delete all rows in table customers; in this case, no criteria is specified in a WHERE clause, so all rows are deleted. Beware running such statements as you will lose all rows in the table.

DELETE FROM customers;
SQL statement to delete all rows from a table.

SQL Tables with constraints

A database table is simply a collection of zero or more rows. This follows from the relational model definition of a relation as a set of tuples over the same scheme. (The name “relational model” comes from the relation being the central object in this model.)

Customers table or relation
The customers database table or relation. Other views of this diagram: Large image - Description (text)

Knowing that the relation (table) is a set of tuples (rows) tells us more about this structure, as we saw with schemes and domains.

Insuring unique rows

Since each tuple in a relation must be unique, no two tuples can have exactly the same values for every one of their attributes, that is, there can be no duplicate tuples in a relation. Unfortunately, the same cannot be said about SQL tables. A SQL table is bag (i.e., a multiset) of rows, unless constraints are placed on the table to ensure there be no duplicate rows. Thus, to implement a relation as a SQL table, there must be some set of attributes in each relation whose values, taken together, guarantee uniqueness of each row. Any set of attributes that can do this is called a super key (SK). By the definition of a relation as a set of tuples, the set of all attributes must be a super key. If such a set were not a super key, it would allow two or more identical tuples in the relation which would violate the definition of a set. Since super keys are constraints on the data, they must be true for any relation (table) of a relation scheme, thus these super keys are shown in the relation scheme diagram. A super key is our first database constraint, we will learn more of them throughout.

The set of all attributes in a relation scheme R is just one super key of that scheme, there can be and usually there are more. The other super keys are proper subsets of the relation scheme. Out of all these super keys, the database designer picks one to serve as the primary key (PK) of the relation. (Notice that the PK is a SK, but not all SKs are PKs, since only one is chosen as a PK!) The PK is sometimes also called a unique identifier for each row of the table. This is not an arbitrary choice—we’ll discuss it in detail on a later page. For our customers table, we’ll pick the set {first_name, last_name, phone}. We are likely to have at least two customers with the same first and last name, but it is very unlikely that they will both have the same phone number.

In SQL, we specify the primary key of a table with a data constraint that lists the attributes that form the PK. We also give the constraint a name that is easy for us to remember later (as is done below using “customers_pk”).

ALTER TABLE customers
  ADD CONSTRAINT customers_pk PRIMARY KEY (first_name, last_name, phone);
SQL statement to alter a table by adding a PK

Since every SQL table must have a PK in order for it to correctly represent a relation, it is best to avoid altering a table to add a PK constraint and instead specify such constraints at the time the table is created. In creating the customers table, you may have noticed the use of NOT NULL for some of the attributes. NULL is a special constant in database systems (as is the case in most programming languages) that means “this field doesn’t have any value assigned to it. ” It’s not the same as a zero length string or the number zero. Thus, NOT NULL is a database constraint that prevents rows from being inserted without a value for attributes with this constraint (similarly also prevents updates that would assign NULL to such attributes).

The statement below recreates the table with the required PK constraint. If you’ve previously created the table, you can use the DROP TABLE statement to delete the table from the database, including of course all data that may be in the table.

CREATE TABLE customers (
   first_name VARCHAR(20) NOT NULL,
   last_name  VARCHAR(20) NOT NULL,
   phone      VARCHAR(20) NOT NULL,
   street     VARCHAR(50),
   zipcode    VARCHAR(5),
   CONSTRAINT customers_pk PRIMARY KEY (first_name, last_name, phone));
SQL statement to create Customers table, including its PK.