Database Design - Associations

Relationships modeled by UML Associations

In the enterprises being modeled by a database, objects interact with each other; we refer to such interactions as relationships. This article introduces relationships, how to model them in UML as associations, and how to implement them in the relational model.

Introduction

The UML association (ER term: relationship type) models the way that two classes are functionally connected to each other; it represents the possibility of a relationship between objects of the classes.

Example: We want our customers to be able to place orders for the products that we sell, so we need to model the Order class and its association with the Customer. Notice that while the Customer class represents a physical “thing,” the Order class represents an event that happens in the enterprise. Both are equally valid class types. We will first describe the Order.

An order is created when a customer decides to buy one or more of our products (in later sections we'll add products to our database).
We need to know when the order was placed (date and time), and which customer representative sold the order.

The association between the customer and the order will tell us which customer placed the order and correspondingly it will also tell us which orders were placed by a customer. We will describe the association in natural language just as we described the classes, but we will also include constraints about how few (at minimum) and how many (at maximum) individuals (objects) of one class may be connected to a single individual of the other class. This is called the multiplicity of the association (ER term: participation and cardinality of a relationship), and we describe it in both directions.

“Each customer places zero or more orders.” (The symbol * in the diagram below means “many”, and any quantity more than one is considered to be“many” in a database.)
“Each order is placed by one and only one customer.” (Bad English—passive voice—but makes sense!)

Every binary association is described by two sentences as listed above for the sample association. The sentences start with the word “each” as we consider just one object from the named class that is the subject of our sentence. The sentence then specifies the participation (0 for optional or 1 for mandatory) and cardinality (1 or many) constraints denoting number of objects from the other class that the one object can relate to. Another way of specifying the participation is with using the modal verbs “may” for optional participation and “must” for mandatory participation. The alternate versions of these sentences are listed below.

“Each customer may place up to many orders.” The verb “may” also reinforces the constraint that allows an object, in this case a customer, to exist in the database without being related to objects of the other class (i.e. without placing an order); that is, the relationship is optional.
“Each order must be placed by at most one customer.” In this case, the verb “must” reinforces that for an object to be in the database, it is required for it to participate in a relationship with an object from the other class.

Class diagram

In the diagram, the binary association is shown simply by a line connecting the two class types. It is named with a verb that describes the action; an arrow shows which way to read the verb; though, as explained above, there are in fact two reading directions. Symbols at each end of the line represent the multiplicity of the association, as described above.

Considering just the cardinality constraint, which is the maximum multiplicity at each end of the line (1 and * in the diagram above), this relationship is a one-to-many association. It's also more precisely referred to as a binary association as it models the relationships of objects from two classes. It's possible for associations to link objects from more than two classes; these are called n-ary associations.

The UML representation of the Order class contains only its own descriptive attributes. The UML association models the customer placing an order. In the database, we will need a different way (not just a line) to identify the customer; that will be part of the relation scheme (below).

Relation scheme diagram

As we learned in previous sections, a table in a database stores rows where each row represents a single object, thing or fact. A database is a collection of tables, each representing different objects (different due to the structure of each object). Here we will learn how to represent relationships among the objects being modeled.

The UML class diagram now has two classes linked by a binary association. This is the most basic of database models; in the relational model, it is implemented by starting with a relation scheme for each of the classes. In a previous section we have already seen the relation scheme corresponding to the Customer class. As we did before with mapping of the Customer class, the relation scheme for the Orders table contains all of the attributes from the class diagram. However, this is not sufficient.

We also need to represent the binary association “places” in the database; that is, we need to record which customer placed each order. In relational databases, we represent associations by data in the database. In this case, the orders table needs to have an attribute that holds the information about the customer who placed the order. We do this by copying the PK attributes of the Customer into the Orders scheme. The copied attributes are called a foreign key (FK), which is simply an image of the linked relation’s primary key.

Relation schema Customers and Orders with a relationship between them — Customer-order relation scheme diagram. Large image - Data dictionary (text)

Since we can’t have an order without a customer, we call Customers the parent scheme and Orders the child scheme in this association. The “one” side of an association is always the parent scheme, and provides the PK attributes to be copied. The “many” side of an association is always the child scheme, into which the FK attributes are copied. Memorize it: one, parent, PK; many, child, FK.

An FK might or might not become part of the PK of the child relation into which it is copied. In the example, it does, since we need to know both who placed an order and when the order was placed in order to identify it uniquely.

We can generalize the above to obtain a way to map a one-to-many binary association between two classes, call them class_R and class_S, where each class_R object can relate to many class_S objects and thus, each class_S object relates to at most one class_R object.

Map UML class_R to relation scheme R and determine its primary key PK_R
Map UML class_S to relation scheme S and determine its primary key (or partial PK)
In a one-to-many binary association, the parent class is the one whose objects can each relate to at most many objects of the other class. Thus, in this case, class_R is the parent class and class_S is the child class. After the previous steps, R is the parent relation scheme and S is the child relation scheme.
To map the one-to-many binary association between these two classes, copy the PK of the corresponding parent relation scheme and add it as attribute(s) of the child relation scheme. This means, we copy attributes PK_R to relation scheme S where these will be defined with a Foreign Key constraint.
If necessary, update the PK_S as needed so that it can uniquely identify rows in table S, as was done in the example above with the one-to-many relationship named places between customer and order.

Steps to map a one-to-many binary association to the relational model

Creating the child table

Since the Orders table has a FK to maintain the relationship, when we create this table we must define this as a constraint, the FK constraint. The Orders table is created after the Customers table since the PK must be defined in order for the FK to refer to it. Otherwise it is created in the same way as the Customers table.

CREATE TABLE orders (
   first_name VARCHAR(20),
   last_name  VARCHAR(20),
   phone      VARCHAR(20),
   order_date  TIMESTAMP, -- stores both date and time
   sold_by     VARCHAR(20),
   CONSTRAINT orders_pk PRIMARY KEY (first_name, last_name, phone, order_date),
   CONSTRAINT orders_customers_fk FOREIGN KEY (first_name, last_name, phone)
      REFERENCES customers (first_name, last_name, phone)
);

We can make the following observations about creating this table.

The FK attributes must be exactly the same data type and size as they were defined in the PK table.
The TIMESTAMP data type includes the date and time in a single value and is available in most database systems, but not in others (which would require an additional attribute, order_time, to permit more than one order per customer in a single day).
To insure that every row of the Orders table is unique, we need to know both who the customer is and what day (and time) the order was placed, thus the PK is the set of four attributes as specified above.
Finally, we need to identify which attributes make up the FK, and where they are found as a PK. The FK constraint will insure that every order contains a valid customer name and phone number; that is, each row in order must refer to a single row in the Customers table, this reference is done with the value for customer name and phone number. This constraint that every FK value must have a matching PK value in the table being referenced is called maintaining the referential integrity of the database.

When you look at some typical data in the Orders table, you will see that some customers have placed more than one order. For each of these, the same customer information is copied in the FK columns—but the dates will be different. Of course, we hope to see many orders that were placed on the same date—including different orders by the same customer. You will also see that some customers haven’t placed any orders at all; their PK information is simply not found in the orders table.

Customers and information about their orders

Alvaro	Monge	562-333-4141	2003-07-14	Patrick
Wayne	Dick	562-777-3030	2003-07-14	Patrick
Alvaro	Monge	562-333-4141	2003-07-18	Kathleen
Alvaro	Monge	562-333-4141	2003-07-20	Kathleen

Note: The date format shown in our examples ('yyyy-mm-dd') is used by many but not all systems. Consult the reference for your own software to be sure.

Basic structures: Database Relationships

Relationships modeled by UML Associations

Introduction

Class diagram

Relation scheme diagram

Creating the child table