SQL technique: union and minus
Set operations on tables
Some students initially think of the join as being a sort of union between two tables. It’s not (except for the schemes). The join pairs up data from two very different tables. In RA and SQL, union can operate only on two identical tables. Remember the Venn-diagram representation of the union and minus operations on sets. Union includes members of either or both sets (with no duplicates). Minus includes only those members of the set on the left side of the expression that are not contained in the set on the right side of the expression.
Union
For this example, we will add a Suppliers table to our sales data entry model. “A supplier is a company from which we purchase products that we will re-sell.” Each supplier suppliers zero to many products; each product is supplied by one and only one supplier. The supplier class attributes include the company name and address, plus the name and phone number of the representative from whom we make our purchases.
SELECT cLastName AS "Last Name", cFirstName AS "First Name",
cPhone as "Phone", 'Customer' AS "Company"
FROM customers
UNION
SELECT repLName, repFName, repPhone, sCompanyName
FROM suppliers
ORDER BY "Last Name";
Bradley | Jerry | 888-736-8000 | Industrial Tool Supply |
Dick | Wayne | 562-777-3030 | Customer |
Jewett | Tom | 714-555-1212 | Customer |
Monge | Alvaro | 562-333-4141 | Customer |
O'Brien | Tom | 949-567-2312 | Bosch Machine Tools |
Minus
Sometimes you have to think about both what you do want and what you don’t want in the results of a query. If there is a WHERE clause predicate that completely partitions all rows of interest (the result set) into those you want and those you don’t want, then you have a simple query with a test for inequality.
SELECT prodName, sCompanyName
FROM Products NATURAL JOIN Suppliers
WHERE sCompanyName <> 'Industrial Tool Supply';
1. A customer made purchases in 2002 (only).
2. A customer made purchases in other years, but not in 2002.
3. A customer made purchases both in other years and in 2002.
4. A customer made no purchases in any year.
SELECT DISTINCT cLastName, cFirstName, cStreet, cZipCode
FROM Customers NATURAL JOIN Orders
WHERE TO_CHAR(orderDate, 'YYYY') <> '2002';
you will correctly exclude group 1 and include group 2, but falsely include group 3 and falsely exclude group 4. Please take time to re-read this statement and convince yourself why it is true!
{customers who did not make purchases in 2002} = {all customers} − {those who did}
There are two ways to write this in SQL.
SELECT cLastName, cFirstName, cStreet, cZipCode
FROM Customers
WHERE custID NOT IN
(SELECT custID
FROM Orders
WHERE TO_CHAR(orderDate, 'YYYY') = '2002');
SELECT cLastName, cFirstName, cStreet, cZipCode
FROM Customers
MINUS
SELECT cLastName, cFirstName, cStreet, cZipCode
FROM Customers NATURAL JOIN Orders
WHERE TO_CHAR(orderDate, 'YYYY') = '2002';
Other set operations
SQL has two additional set operators. UNION ALL works like UNION, except it keeps duplicate rows in the result. INTERSECT operates just like you would expect from set theory; again, the schemes of the two tables must match exactly.