Retrieving data from one table
Retrieval with SQL
In SQL, to retrieve data stored in our tables, we use the SELECT statement. The result of this statement is always in the form of a table that we can view with our database client software or use with programming languages to build dynamic web pages or desktop applications. While the result may look like a table, it is not stored in the database like the named tables are. The result of a SELECT statement can also be used as part of another statement.
Basic syntax of SELECT statement
The basic syntax consists of four clauses as shown in the figure below. While SQL keywords are not case sensitive, by convention many database developers write them in uppercase to improve readability.
SELECT {attribute}+
FROM {table}+
[ WHERE {boolean predicate to pick rows} ]
[ ORDER BY {attribute}+ ];
Basic syntax of the SELECT statement.
Of the four clauses, only the first two are required; the two shown in square brackets are optional. When you start learning to build queries, it is helpful to follow a specific step-by-step sequence, look at the data after each modification to the query, and be sure that you understand the results at each step. This iterative refinement will allow you to home in on just the right SQL statement to retrieve the desired information. Below is a summary of the clauses.
- The SELECT clause allows us to specify a comma-separated list of attribute names corresponding to the columns that are to be retrieved. You can use an asterisk character, *, to retrieve all the columns.
- In queries where all the data is found in one table, the FROM clause is where we specify the name of the table from which to retrieve rows. In other articles we will use it to retrieve rows from multiple tables.
- The WHERE clause is used to constrain which rows to retrieve. We do this by specifying a boolean predicate that compares the values of table columns to literal values or to other columns.
- The ORDER BY clause gives us a way to order the display of the rows in the result of the statement.
The example in the next section provides more information on how to retrieve information using the SELECT statement.
SQL Example: customers in a specified zip code
We’ll build a list of customers who live in a specific zip code area, showing their first and last names and phone numbers and listing them in alphabetical order by last name. A company might want to do this to initiate a marketing campaign to customers in this area. In this example, we’ll use zip code 90840. Listed below are the refinement steps we take to arrive at the statement that will retrieve what we need.
- Start by retrieving all of the relevant data; in this case, that
is all data of every customer. In our database all of this is stored in
only one table, so that table is specified in the FROM clause. Since we
want to retrieve all columns from this table, instead of naming each of
them individually, we can use the abbreviation symbol * to indicate that
all columns are to be retrieved. That completes the recipe for our
SQL statement which is shown below; note, we have no use for the two
optional clauses in this initial statement. In the same figure below,
you will also find the result of this query executed on a tiny database.
SELECT * FROM customers;
Customers
Tom     Jewett  714-555-1212  10200 Slater     92708
Alvaro  Monge   562-333-4141  2145 Main        90840
Wayne   Dick    562-777-3030  1250 Bellflower  90840
SQL statement to retrieve all customers and the result set
- Clearly we need a refinement step, as the query retrieves all customers while we are only interested in customers who live in zip code 90840. We need to specify in the statement that the only rows to retrieve from the database are those that meet this criterion. Such qualifying criteria are specified in the WHERE clause using boolean expressions. Our first statement is thus refined as shown in the figure below.
SELECT * FROM customers WHERE cZipCode = '90840';
Customers in zip code 90840
Alvaro  Monge  562-333-4141  2145 Main        90840
Wayne   Dick   562-777-3030  1250 Bellflower  90840
Refinement #2 to retrieve desired customers.
Note that the zip code is written as a string literal enclosed in single quotes: '90840'. While not illustrated in this example, and unlike SQL keywords, literal strings and strings stored in the database are generally case sensitive (the exact behavior depends on the database system's collation settings); thus, 'Long Beach' is a different string than 'long beach'.
- We need just a couple more refinements. While we are now retrieving only the customers we want, we are also retrieving every column from the table, and not all of them are needed. We need a way to pick the attributes (columns) we want. This is done by listing them in the SELECT clause, with the column names separated by commas. The figure below shows this refinement and its corresponding result set.
SELECT cLastName, cFirstName, cPhone FROM customers WHERE cZipCode = '90840';
Columns from SELECT
Monge  Alvaro  562-333-4141
Dick   Wayne   562-777-3030
Refinement #3 to retrieve specific columns.
- For practical purposes, the previous refinement is all that we need. To make the result set more appealing to a human, we may want to order it. Imagine having a result set 100 times the size of what we are showing here! It would be better to display the result sorted alphabetically by the name of the customer. In SQL, you can use the ORDER BY clause to specify the order in which to retrieve the results. Once again, this ordering does not change the meaning of the results; the set of rows retrieved stays the same, and all that changes is the order in which they are displayed. This final refinement and its result are shown below.
SELECT cLastName, cFirstName, cPhone FROM customers WHERE cZipCode = '90840' ORDER BY cLastName ASC, cFirstName ASC;
Rows in order
Dick   Wayne   562-777-3030
Monge  Alvaro  562-333-4141
Refinement #4 to order the rows in the result.
The keyword ASC orders the rows in ascending values; this is the default ordering, so the keyword is not necessary and is shown here only for completeness. To order rows in descending values, use the keyword DESC. In the statement above, rows are first ordered in ascending value of the last name and, in case of ties (two or more customers with the same last name), the rows are then ordered in ascending value of the first name.
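The four refinement steps can be replayed end to end with a small script. The following is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database; the column name cStreet is an assumption made for illustration (the article does not name the street column), and other database systems may need slightly different setup.

```python
import sqlite3

# In-memory database holding the three customers from the figures above.
# The column name cStreet is an assumption; the article does not name it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " cFirstName TEXT, cLastName TEXT, cPhone TEXT,"
    " cStreet TEXT, cZipCode TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        ("Tom", "Jewett", "714-555-1212", "10200 Slater", "92708"),
        ("Alvaro", "Monge", "562-333-4141", "2145 Main", "90840"),
        ("Wayne", "Dick", "562-777-3030", "1250 Bellflower", "90840"),
    ],
)

# The four refinement steps, in the order they were developed.
steps = [
    "SELECT * FROM customers",
    "SELECT * FROM customers WHERE cZipCode = '90840'",
    "SELECT cLastName, cFirstName, cPhone FROM customers"
    " WHERE cZipCode = '90840'",
    "SELECT cLastName, cFirstName, cPhone FROM customers"
    " WHERE cZipCode = '90840'"
    " ORDER BY cLastName ASC, cFirstName ASC",
]

# Run every step and keep its result set so each one can be inspected.
results = [conn.execute(sql).fetchall() for sql in steps]

for number, (sql, rows) in enumerate(zip(steps, results), start=1):
    print(f"Step {number}: {sql}")
    for row in rows:
        print("   ", row)
```

Looking at the output after each step mirrors the iterative approach recommended earlier: each statement changes one thing, and the effect on the result set is visible immediately.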
Retrieval with relational algebra
SQL is a declarative language. As such, SQL is used to declare what is to be retrieved from the database. In our SQL statement above, we did not specify how to retrieve the result. In an imperative language, we do specify the steps to take to solve a problem, such as how to retrieve a result from a database. Thus, it is the responsibility of the database system to determine how to retrieve what is declared in SQL. In relational database systems, this is commonly done by translating SQL into Relational Algebra.
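To make the declarative/imperative contrast concrete, here is a rough sketch that answers the same question both ways. It assumes Python's standard sqlite3 module; the function names imperative_lookup and declarative_lookup and the column name cStreet are hypothetical, chosen only for this illustration.

```python
import sqlite3

# Toy data from the figures above:
# (first name, last name, phone, street, zip code).
ROWS = [
    ("Tom", "Jewett", "714-555-1212", "10200 Slater", "92708"),
    ("Alvaro", "Monge", "562-333-4141", "2145 Main", "90840"),
    ("Wayne", "Dick", "562-777-3030", "1250 Bellflower", "90840"),
]


def imperative_lookup(rows, zip_code):
    """Imperative style: we spell out each step of the retrieval ourselves."""
    matches = []
    for first, last, phone, _street, row_zip in rows:  # scan every row
        if row_zip == zip_code:                        # test the predicate
            matches.append((last, first, phone))       # keep three columns
    matches.sort(key=lambda m: (m[0], m[1]))           # order by last, first
    return matches


def declarative_lookup(rows, zip_code):
    """Declarative style: we state what we want; the database decides how."""
    conn = sqlite3.connect(":memory:")
    # cStreet is an assumed column name; the article does not name it.
    conn.execute(
        "CREATE TABLE customers (cFirstName TEXT, cLastName TEXT,"
        " cPhone TEXT, cStreet TEXT, cZipCode TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?)", rows)
    # The ? placeholder stands in for the literal '90840' used in the article.
    result = conn.execute(
        "SELECT cLastName, cFirstName, cPhone FROM customers"
        " WHERE cZipCode = ? ORDER BY cLastName, cFirstName",
        (zip_code,),
    ).fetchall()
    conn.close()
    return result


print(imperative_lookup(ROWS, "90840"))
print(declarative_lookup(ROWS, "90840"))
```

Both functions return the same rows, but only the first one commits to a particular strategy (a full scan followed by a sort). With the SQL version, the database system is free to choose a different strategy, for example using an index, without the statement changing.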
Like all algebras, RA applies operators to operands to produce results of the same type as the operands. RA operands are relations and thus the results are also relations. Furthermore, like all algebras, the results of operators can be used as operands in building more complex expressions. We introduce two of the RA operators following the example and refinements above for SQL.
RA operators: σ and π
To retrieve a single relation in RA, we only need to use its name. The common notation in the relational model is to use uppercase letters for relation schemes (R, S, T, U, etc.) and lowercase letters for relations (r, s, t, u, etc.). Thus, the simplest RA expression, which retrieves all columns and every row of a relation, is just the name of the relation: r
The two RA operators introduced here are σ, the select operator, and π, the project operator.
- The select (RA) operator, specified by the symbol σ, picks tuples that satisfy a predicate, thus serving a similar purpose as the SQL WHERE clause. This RA select operator σ is unary, taking a single relation or RA expression as its operand. The predicate, θ, that specifies which tuples are required is written as a subscript of the operator, giving the syntax σ_θ(e), where e is an RA expression. The scheme of the result of σ_θ(r) is R, the same scheme we started with, since an entire tuple is selected whenever it satisfies the predicate. The result of this operation includes all tuples of relation r that satisfy the predicate θ, that is, those for which θ evaluates to true.
- The project (RA) operator, specified by the symbol π, picks attributes; somewhat confusingly, it corresponds to the SQL SELECT clause rather than to the RA select operator. It is also a unary operator that takes a single relation or expression as its operand, and the attributes to retrieve are specified as a subscheme, X (a subset of the scheme of its operand). The syntax is π_X(e) where, as before, e is an RA expression. Following are additional properties of the project operator.
  - For X to be a subscheme of R, it must be a subset of the attributes in R that preserves the assignment rule from R (that is, each attribute of X must have the same domain as its corresponding attribute in R).
  - The scheme of the result of π_X(r) is X. The tuples resulting from this operation are tuples of the original relation, r, but cut down to the attributes contained in X.
  - If X is a superkey of r, then there will be the same number of tuples in the result as there were to begin with in r. If X is not a superkey of r, then any duplicate (non-distinct) tuples are eliminated from the result, ensuring the result is always a set. This is unlike SQL, where the result of a SELECT statement, with or without a WHERE clause, is a multiset (duplicate rows are kept unless DISTINCT is specified).
- As with other algebras, we can use function composition by applying the project operator to the result of the select operator to get:
π_X(σ_θ(r))
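The set-versus-multiset difference described for the project operator can be observed directly. The sketch below assumes Python's standard sqlite3 module and a cut-down customers table; treating cPhone as a superkey is an assumption that happens to hold for this toy data, and a Python set stands in for the RA result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (cLastName TEXT, cPhone TEXT, cZipCode TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Jewett", "714-555-1212", "92708"),
        ("Monge", "562-333-4141", "90840"),
        ("Dick", "562-777-3030", "90840"),
    ],
)

# SQL keeps duplicates: one output row per input row (a multiset).
sql_rows = conn.execute("SELECT cZipCode FROM customers").fetchall()

# RA projection onto cZipCode: the result is a set, so duplicates vanish.
ra_result = set(sql_rows)

# DISTINCT asks SQL for the set behavior explicitly.
distinct_rows = conn.execute(
    "SELECT DISTINCT cZipCode FROM customers"
).fetchall()

# Projecting onto cPhone (assumed to be a superkey here): every value is
# different, so the set has as many tuples as the original relation.
phone_result = set(conn.execute("SELECT cPhone FROM customers").fetchall())

print(len(sql_rows), len(ra_result), len(distinct_rows), len(phone_result))
# prints: 3 2 2 3
```

The zip code 90840 appears twice in the plain SQL result but only once in the set, while projecting onto the assumed superkey loses nothing.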
RA Example: customers in a specified zip code
Given the above RA syntax, we can now use RA to create expressions that match the SQL statements from above which retrieve the customers who live in zip code 90840.
- The first step is to retrieve all customers. This is done by an RA expression that consists of just the name of the relation; thus, the RA expression
customers
is the equivalent of the first SQL statement above. Its scheme is the same as the Customers scheme.
- To retrieve the equivalent result set as the SQL statement in refinement #2, we apply the σ operator to the result set of our previous expression:
σ_{cZipCode='90840'}(customers)
Again, the scheme of the result set is the same as the Customers scheme.
- Now, applying function composition, we can retrieve just the columns we want from the result set of the previous expression to get the RA expression that retrieves the equivalent result set as the SQL statement in refinement #3:
π_{cLastName, cFirstName, cPhone}(σ_{cZipCode='90840'}(customers))
- Note that in RA the results of expressions are strictly sets of tuples; thus, there is no way to specify the order of tuples in a result set. This is unlike SQL and its ORDER BY clause.
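The RA steps above can also be mimicked in a few lines of code. This is only an illustrative sketch, not how a database system implements RA: the helper names make_tuple, select, and project are hypothetical, relations are modeled as Python sets of tuples of (attribute, value) pairs, and the attribute name cStreet is an assumption.

```python
def make_tuple(**values):
    """Build one hashable RA tuple as a tuple of (attribute, value) pairs."""
    return tuple(values.items())


def select(predicate, relation):
    """Sketch of σ: keep the tuples satisfying the predicate; same scheme."""
    return {t for t in relation if predicate(dict(t))}


def project(attributes, relation):
    """Sketch of π: cut tuples down to the attributes; the set drops duplicates."""
    return {tuple((a, dict(t)[a]) for a in attributes) for t in relation}


# The customers relation from the figures (cStreet is an assumed name).
customers = {
    make_tuple(cFirstName="Tom", cLastName="Jewett", cPhone="714-555-1212",
               cStreet="10200 Slater", cZipCode="92708"),
    make_tuple(cFirstName="Alvaro", cLastName="Monge", cPhone="562-333-4141",
               cStreet="2145 Main", cZipCode="90840"),
    make_tuple(cFirstName="Wayne", cLastName="Dick", cPhone="562-777-3030",
               cStreet="1250 Bellflower", cZipCode="90840"),
}

# Step 1: the relation name alone.
step1 = customers

# Step 2: select the tuples whose cZipCode is '90840'.
step2 = select(lambda t: t["cZipCode"] == "90840", step1)

# Step 3: compose project with select to keep three attributes.
step3 = project(["cLastName", "cFirstName", "cPhone"], step2)

# Sets have no order; sorting here is only to make the printout stable.
for t in sorted(step3):
    print(dict(t))
```

Because the results are plain sets, nothing in this sketch corresponds to ORDER BY; any ordering has to be imposed outside the algebra, as the sorted call does for printing.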