Design pattern: many-to-many with history (the library loan)

Many-to-many associations with history

Introduction: limits of UML associations

All UML binary associations carry the constraint that the same two objects can be associated with each other at most once. That is, the relationship instances form a set of pairs of objects, and since a set cannot contain duplicates, the same pair cannot appear twice. This is limiting, because there are situations in which two objects can relate to each other two or more times. In those situations, more information is needed to differentiate between the multiple associations of the same two objects, and since the relationship instances no longer form a set, we cannot use a binary association in our model. This article discusses why the many-to-many association cannot model these relationships and provides a solution to the problem.

Before presenting the solution, we need to clarify a common misconception. We cannot solve this problem by just adding an association class to the binary association. Recall that the UML association class represents attributes that further describe each of the allowed pairs in the many-to-many association. Since the many-to-many association disallows repeated pairs that link the same two objects, the UML association class cannot be used to address this problem. Thus, the many-to-many association is the artifact that models the constraints of the relationship; the association class just allows us to model more information about the association instances. In the order entry pattern example, this means that in one order there can be only one order line (i.e., order detail) for each item ordered. In that case, the use of the many-to-many association and the association class is consistent with the enterprise being modeled. To summarize, a many-to-many association prohibits the same two objects from relating to each other more than once, even in the presence of an association class.

There are times when we need to allow the same two individuals to be paired more than once. In such situations, we need to model the association in a different way and not by a many-to-many association since it prohibits such pairings. This frequently happens when we need to keep a history of events over time.
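The limitation described above can be seen in a small experiment. The following is only a sketch using Python's built-in sqlite3 module; the table name Borrowings and the columns customerID and bookID are hypothetical stand-ins for a many-to-many association with an association class, where the pair of objects is the key and dateTimeOut merely describes the pair.

```python
import sqlite3


def try_second_pairing() -> bool:
    """Return True if a second pairing of the same two objects is rejected."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the pair (customerID, bookID) is the key, as in a
    # many-to-many association; dateTimeOut is an association-class attribute.
    conn.execute(
        """
        CREATE TABLE Borrowings (
            customerID  INTEGER NOT NULL,
            bookID      INTEGER NOT NULL,
            dateTimeOut TEXT    NOT NULL,
            PRIMARY KEY (customerID, bookID)
        )
        """
    )
    conn.execute("INSERT INTO Borrowings VALUES (1, 42, '2014-11-18 09:17:24')")
    try:
        # Same customer, same book, later date: the pair already exists.
        conn.execute("INSERT INTO Borrowings VALUES (1, 42, '2014-12-02 14:05:10')")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


second_pairing_rejected = try_second_pairing()
print(second_pairing_rejected)
```

The extra attribute does not help: because the key is still the pair, the second row is refused even though its dateTimeOut differs.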

Example: Loans of books by a library

In a library, customers can borrow many books and each book can be borrowed by many customers. At first glance, this seems to be a simple many-to-many association between customers and books. But any one customer may borrow a book, return it, and then borrow the same book again at a later time. The library records each book loan separately and needs to track all loans. This record keeping is crucial in public (or private) libraries where they need to demonstrate demand for their services to continue their funding (or even request an increase). There is no invoice for each set of borrowed books and therefore no equivalent here of the Order in the order entry example. (You have already seen other parts of the library model in exercises.)

The loan is an event that happens in the real world; we need a regular class to model it correctly. We’ll call this the “library loan” design pattern. First, we need to understand what the classes and associations mean; below is a list of descriptions that might have been provided to us by librarians.

  • A customer is any person who has registered with the library and is eligible to check out books.
  • A catalog entry is essentially the same as an old-fashioned index card that represents the title and other information about books in the library, and allows the customers to quickly find a book on the shelves.
  • A book-on-the-shelf is the physical volume that is either sitting on the library shelves or is checked out by a customer. There can be many physical books represented by any one catalog entry.
  • A loan event happens when one customer takes one book to the checkout counter, has the book and her library card scanned, and then takes the book home to read.

Class diagram of library loans

This “library loan” design pattern does not have specific UML notation. However, it can be identified easily, and then we can rely on the rest of our UML notation to model it appropriately. To identify the pattern, one must inspect each many-to-many association to determine whether or not the same two objects can relate to each other more than once in that association. If the answer is no, then the many-to-many association models the problem correctly. If the answer is yes, then the model is incorrect, because a many-to-many association cannot handle these situations. Instead, the many-to-many association must be removed from the model and, in its place, a regular UML class is introduced that is linked to each of the other two classes. This class will need at least one attribute that can be used to differentiate among the many instances of the same pair of objects relating to each other. Most often this attribute is a point in time (a date or a time stamp), which gives rise to the name “many-to-many with history”.
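When sample data about the relationship is available, the inspection described above can be approximated mechanically. This is a minimal sketch, not part of any UML tool: the function name repeated_pairs and the sample values are made up for illustration. It reports every pair of objects that occurs more than once in a list of observed pairings.

```python
from collections import Counter


def repeated_pairs(pairings):
    """Return the set of (a, b) pairs that appear more than once."""
    counts = Counter(pairings)
    return {pair for pair, n in counts.items() if n > 1}


# Hypothetical observations of "customer borrows book".
observed = [
    ("alice", "book-1"),
    ("bob", "book-1"),
    ("alice", "book-1"),  # alice borrows the same book again
]

repeats = repeated_pairs(observed)
# If any pair repeats, a plain many-to-many association cannot model the data,
# and a regular class with a discriminating attribute is needed instead.
needs_history_class = bool(repeats)
print(repeats, needs_history_class)
```

An empty result on sample data does not prove that repeats are impossible; the real test is still whether the enterprise allows them.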

Consider modeling a customer checking out a book. As discussed above, on the surface this seems to be a many-to-many association. Further analysis shows that it is, in fact, incorrect to use that type of association, because a library allows a customer to check out a book once and then again, at a later time, as many times as the customer desires. Having recognized the pattern, we need to model the event of checking out a book with a regular class; in our case, that class models the loan event. The UML class diagram below models this library loan design pattern.

Library loan class diagram
Library loan database UML class diagram.

Below we analyze further each of the associations shown in the UML class diagram.

  • Each Customer may make zero or more Loans. Someone can be a Customer but not yet have made any loan, as happens with anyone who signs up as a customer for the first time.
  • Each Loan is made by one and only one Customer. A loan can only exist if there is exactly one customer associated with it.
  • Each Loan checks out one and only one BookOnShelf. Similarly, a loan can only exist if there is exactly one BookOnShelf that is related to it, which is the book checked out.
  • Each BookOnShelf may be checked out by many Loans. A BookOnShelf might not participate in any loans, indicating the optional participation of the BookOnShelf in the checked out relationship. Also, a BookOnShelf can, over time, be the subject of many loans.
  • Each BookOnShelf is represented by one and only one CatalogEntry (catalog card). A BookOnShelf is the physical copy of exactly one CatalogEntry.
  • Each CatalogEntry must represent one or more physical copies (BookOnShelf) of the same book. Finally, a CatalogEntry represents all the copies of a book that the library has; thus a CatalogEntry can only exist if there’s at least one physical copy of the book that it represents, implying a mandatory participation.
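The associations listed above can be sketched as plain Python classes. This is an illustration only: the attribute names (name, call_nmbr, title, copy_nmbr, date_time_out) and the sample values are assumptions, and the mandatory "at least one copy per CatalogEntry" rule is not enforced here. The point is that Loan is a regular class holding exactly one Customer and exactly one BookOnShelf, so the same pair can appear in several Loan objects.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Customer:
    name: str


@dataclass(frozen=True)
class CatalogEntry:
    call_nmbr: str
    title: str


@dataclass(frozen=True)
class BookOnShelf:
    catalog_entry: CatalogEntry  # exactly one CatalogEntry per physical copy
    copy_nmbr: int


@dataclass(frozen=True)
class Loan:
    customer: Customer           # exactly one Customer per Loan
    book: BookOnShelf            # exactly one BookOnShelf per Loan
    date_time_out: datetime      # tells repeated loans of the same pair apart


# Hypothetical sample data.
entry = CatalogEntry("QA76.9.D3", "A Hypothetical Title")
copy1 = BookOnShelf(entry, 1)
alice = Customer("Alice")

loans = [
    Loan(alice, copy1, datetime(2014, 11, 18, 9, 17, 24)),
    Loan(alice, copy1, datetime(2014, 12, 2, 14, 5, 10)),
]
# Both loans pair the same customer with the same copy, yet they are distinct.
distinct_loans = set(loans)
print(len(distinct_loans))
```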

Relation scheme diagram

This section did not introduce any additional UML notation; thus, the mapping of the UML class diagram relies on techniques described in previous articles.

As in the order entry example, the Customers table will need a surrogate key (added by us) to save space when it is copied into Loans. The CatalogEntries scheme already has two external keys: the call number and the ISBN. The first of these is defined by the Library of Congress Classification system, and contains codes that represent the subject, author, and year published. The second of these is defined by an ISO standard, number 2108. We’ll use the callNmbr as the primary key, since it has more descriptive value than the ISBN and is smaller than the more descriptive CK of {title, pubDate}. Notice that this is a great example of a relation scheme with three candidate keys: the call number, the ISBN, and {title, pubDate}.
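One possible way to declare these keys is sketched below with Python's built-in sqlite3 module. The column names customerID, name, and isbn are assumptions (only callNmbr, title, and pubDate are named in the text), and the values such as 'CALL-1' and 'ISBN-A' are placeholders rather than real call numbers or ISBNs. The primary key is callNmbr, and the other two candidate keys are declared UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (
        customerID INTEGER PRIMARY KEY,   -- surrogate key added by us
        name       TEXT NOT NULL
    );
    CREATE TABLE CatalogEntries (
        callNmbr TEXT PRIMARY KEY,        -- chosen primary key
        isbn     TEXT NOT NULL UNIQUE,    -- second candidate key
        title    TEXT NOT NULL,
        pubDate  TEXT NOT NULL,
        UNIQUE (title, pubDate)           -- third candidate key
    );
    """
)


def rejected(row):
    """Return True if inserting the row violates one of the declared keys."""
    try:
        conn.execute("INSERT INTO CatalogEntries VALUES (?, ?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True


# The surrogate key is generated by the database, not supplied by us.
new_customer_id = conn.execute(
    "INSERT INTO Customers (name) VALUES ('Alice')"
).lastrowid

conn.execute(
    "INSERT INTO CatalogEntries VALUES ('CALL-1', 'ISBN-A', 'Example Title', '2014')"
)
dup_call_nmbr = rejected(("CALL-1", "ISBN-B", "Other Title", "2015"))
dup_isbn = rejected(("CALL-2", "ISBN-A", "Other Title", "2015"))
dup_title_date = rejected(("CALL-3", "ISBN-C", "Example Title", "2014"))
new_edition = rejected(("CALL-4", "ISBN-D", "Example Title", "2015"))
print(dup_call_nmbr, dup_isbn, dup_title_date, new_edition)
```

Each of the three candidate keys independently rejects a duplicate, while a row that differs on all three is accepted.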

Library loan relation scheme
Library loan relational database scheme.

The Loans scheme will include two FKs, one to reference the Customers scheme and another to reference the BooksOnShelf scheme. That maps the two one-to-many associations with Loans. This pair of FKs models which customer borrowed which book, but it is not enough to distinguish among different Loan instances of the same customer borrowing the same book. To distinguish among such instances, we have to know when the book was borrowed; that is, we need the dateTimeOut attribute in order to pair a customer with the same book more than once. The attribute dateTimeOut is a time stamp that includes the date and time, including seconds; for example: 2014-11-18 09:17:24. Such an attribute is known as a discriminator attribute, since it allows us to discriminate between the multiple pairings of the same customer and book. Note that even if this had been modeled as a many-to-many association between Customer and BookOnShelf with an association class to store the dateTimeOut, it still would be an incorrect model. This is why we need the Loan class along with the discriminator attribute.
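The role of the discriminator attribute can be sketched with Python's built-in sqlite3 module. The schema below is an assumption made for illustration: customerID and name are invented column names, CatalogEntries is omitted for brevity, and the key of Loans is taken in its general form (both FKs plus dateTimeOut). The call number value is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript(
    """
    CREATE TABLE Customers (
        customerID INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE BooksOnShelf (
        callNmbr TEXT    NOT NULL,
        copyNmbr INTEGER NOT NULL,
        PRIMARY KEY (callNmbr, copyNmbr)
    );
    CREATE TABLE Loans (
        customerID  INTEGER NOT NULL REFERENCES Customers (customerID),
        callNmbr    TEXT    NOT NULL,
        copyNmbr    INTEGER NOT NULL,
        dateTimeOut TEXT    NOT NULL,     -- discriminator attribute
        PRIMARY KEY (customerID, callNmbr, copyNmbr, dateTimeOut),
        FOREIGN KEY (callNmbr, copyNmbr)
            REFERENCES BooksOnShelf (callNmbr, copyNmbr)
    );
    """
)

conn.execute("INSERT INTO Customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO BooksOnShelf VALUES ('QA76.9.D3', 1)")

# The same customer borrows the same physical copy on two occasions.
conn.executemany(
    "INSERT INTO Loans VALUES (?, ?, ?, ?)",
    [
        (1, "QA76.9.D3", 1, "2014-11-18 09:17:24"),
        (1, "QA76.9.D3", 1, "2014-12-02 14:05:10"),
    ],
)

loan_count = conn.execute(
    "SELECT COUNT(*) FROM Loans "
    "WHERE customerID = 1 AND callNmbr = 'QA76.9.D3' AND copyNmbr = 1"
).fetchone()[0]
print(loan_count)
```

Because dateTimeOut is part of the key, both rows are accepted and the loan history of the pair is preserved.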

In most cases like this, we would use both FKs plus the discriminator attribute dateTimeOut as the PK of Loans. In this particular case, that set of attributes is not minimal. We need only the FK from BooksOnShelf, {callNmbr, copyNmbr}, and the dateTimeOut (since it is physically impossible to run the same book through the scanner more than once at the same instant). Notice that there is actually another CK for Loans: {dateTimeOut, scannerID}, since it is also physically impossible for the same scanner to read two different books at exactly the same time. We choose {callNmbr, copyNmbr, dateTimeOut} because it has just a bit more descriptive value and because we don’t care about size here (since Loans has no children and thus its PK is not copied as a FK).
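The two candidate keys of Loans can be sketched as follows, again with Python's built-in sqlite3 module. The column customerID, the integer type of scannerID, and all sample values are assumptions for illustration; foreign key declarations are left out to keep the focus on the keys. The chosen PK is {callNmbr, copyNmbr, dateTimeOut}, and the alternative CK {dateTimeOut, scannerID} is declared UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Loans (
        customerID  INTEGER NOT NULL,
        callNmbr    TEXT    NOT NULL,
        copyNmbr    INTEGER NOT NULL,
        dateTimeOut TEXT    NOT NULL,
        scannerID   INTEGER NOT NULL,
        PRIMARY KEY (callNmbr, copyNmbr, dateTimeOut),  -- chosen key
        UNIQUE (dateTimeOut, scannerID)                 -- other candidate key
    )
    """
)


def accepted(row):
    """Return True if the row can be inserted without violating a key."""
    try:
        conn.execute("INSERT INTO Loans VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    # First loan of copy 1.
    accepted((1, "QA76.9.D3", 1, "2014-11-18 09:17:24", 7)),
    # Same copy at the same instant on another scanner: violates the PK.
    accepted((2, "QA76.9.D3", 1, "2014-11-18 09:17:24", 8)),
    # Different book on the same scanner at the same instant: violates UNIQUE.
    accepted((2, "PS3545.I5", 1, "2014-11-18 09:17:24", 7)),
    # Same customer and same copy at a later time: allowed.
    accepted((1, "QA76.9.D3", 1, "2014-12-02 14:05:10", 7)),
]
print(results)
```

Note that customerID is not part of either key, which matches the observation that both FKs plus dateTimeOut would not be minimal.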