# Basic query operation: the join

## SQL syntax

In most queries, we will want to see data from two or more tables.
To do this, we need to **join** the tables in a way that matches up
the right information from each one to the other—in our example, listing all of the
customer data along with all of the order data for the orders that each customer
has placed. In the latest SQL standard, the join is specified in the FROM clause:

```
SELECT *
FROM customers
NATURAL JOIN orders;
```

cfirstname | clastname | cphone | cstreet | czipcode | orderdate | soldby |
---|---|---|---|---|---|---|

Alvaro | Monge | 562-333-4141 | 2145 Main | 90840 | 2003-07-14 | Patrick |

Wayne | Dick | 562-777-3030 | 1250 Bellflower | 90840 | 2003-07-14 | Patrick |

Alvaro | Monge | 562-333-4141 | 2145 Main | 90840 | 2003-07-18 | Kathleen |

Alvaro | Monge | 562-333-4141 | 2145 Main | 90840 | 2003-07-20 | Kathleen |

**join attributes**that are matched—here {cfirstname, clastname, cphone}—are shown only once in the result, along with all other attributes of both tables.

## How it works

The easiest way to understand the join is to think of the database software looking one-by-one at each pair of rows from the two tables.

cfirstname | clastname | cphone | cstreet | czipcode |
---|---|---|---|---|

Tom | Jewett | 714-555-1212 | 10200 Slater | 92708 |

Alvaro | Monge | 562-333-4141 | 2145 Main | 90840 |

Wayne | Dick | 562-777-3030 | 1250 Bellflower | 90840 |

cfirstname | clastname | cphone | orderdate | soldby |
---|---|---|---|---|

Alvaro | Monge | 562-333-4141 | 2003-07-14 | Patrick |

Wayne | Dick | 562-777-3030 | 2003-07-14 | Patrick |

Alvaro | Monge | 562-333-4141 | 2003-07-18 | Kathleen |

Alvaro | Monge | 562-333-4141 | 2003-07-20 | Kathleen |

## RA syntax

The **RA join** of two relations, *r* over scheme R and *s*
over scheme S, is written *r* ⋈ *s*, or in our example,
*customers* ⋈ *orders*.
The scheme of the result,
exactly as you have seen in the SQL syntax, is the union of the two relation
schemes, R∪S. The join attributes are found in the intersection of the two
schemes, R∩S. Clearly, the intersection attributes must inherit the same assignment
rule from R and S; this makes the two schemes **compatible**.

**paste**of all tuples from the two relations, written paste(

*t,u*) for any tuple

*t*from relation

*r*over scheme R and

*u*from relation

*s*over scheme S. The result of the paste operation is exactly as explained in the preceeding section.

*r*is pasted to every tuple from relation

*s*, with all of the attributes from both schemes contained in the resulting tuples. In set theory, this is a

**Cartesian product**of the two relations; in practice, it is almost always nonsense and not what you want. The Cartesian product can also be written in RA as

*r*×

*s*, or intentionally specified in SQL with the CROSS JOIN keyword.