Design pattern: multivalued attributes (hobbies)

Modeling hobbies for our contacts

Introduction

Attributes (like phone numbers) that are explicitly repeated in a class definition are not the only design problem that we might have to correct. Suppose that we want to know what hobbies each person on our contact list is interested in (perhaps to help us pick birthday or holiday presents). We might add an attribute to hold these. More likely, someone else has already built the database, and added this attribute without thinking about it.

Class diagram showing an attribute named hobbies which is expected to store many values
Contact class diagram now also modeling hobbies.

The multivalued attribute is easy to spot in this example because its name is plural. Be aware that this won’t always be the case. We can only be sure that there’s a design problem when we find data in the table like that shown below.

Contact hobbies
contactid  firstname  lastname  hobbies
1639       George     Barnes    reading
5629       Susan      Noble     hiking, movies
3388       Erwin      Star      hockey, skiing
5772       Alice      Buck
1911       Frank      Borders   photography, travel, art
4848       Hanna      Diedrich  gourmet cooking

In this case, the hobbies attribute wasn’t repeated in the scheme, but many distinct values have been entered for it in the same column of a single row. This is called a multivalued attribute. The problem with this design is that it is now difficult (but possible) to search the table for any particular hobby that a person might have, and there is no simple way to write a query that lists each hobby in the table individually. Unlike the phone book example, NULL is probably not part of the problem here, even if we don’t know the hobbies for everyone in the database.
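To make the search problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The lowercase table and column names, the function name find_by_hobby, and the choice to store Alice's blank entry as an empty string are all assumptions made for illustration; they are not part of the original design.

```python
import sqlite3

# In-memory database holding the original (problematic) design.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts ("
    " contactid INTEGER PRIMARY KEY,"
    " firstname TEXT,"
    " lastname TEXT,"
    " hobbies TEXT)"  # the multivalued attribute, packed into one column
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?)",
    [
        (1639, "George", "Barnes", "reading"),
        (5629, "Susan", "Noble", "hiking, movies"),
        (3388, "Erwin", "Star", "hockey, skiing"),
        (5772, "Alice", "Buck", ""),  # blank in the table; empty string is an assumption
        (1911, "Frank", "Borders", "photography, travel, art"),
        (4848, "Hanna", "Diedrich", "gourmet cooking"),
    ],
)


def find_by_hobby(hobby):
    """Substring search: the only option while all hobbies share one column."""
    rows = conn.execute(
        "SELECT firstname, hobbies FROM contacts WHERE hobbies LIKE ?",
        (f"%{hobby}%",),
    )
    return rows.fetchall()


print(find_by_hobby("movies"))   # [('Susan', 'hiking, movies')]
print(find_by_hobby("cooking"))  # [('Hanna', 'gourmet cooking')]
```

The search works, but only as a pattern match. The second call shows that a fragment of a hobby name matches just as readily as a whole one, and every result comes back as the full comma-separated string rather than as the single hobby we asked about.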

Using UML Multiplicity for multivalued attributes

In UML, we can again use the multiplicity notation to show that a contact may have more than one value for hobby.

Revised contact hobbies class diagram, now accurately and explicitly showing that the hobbies attribute can have many values
Revised contact hobbies class diagram.

Mapping to the relational model

As you should expect by now, we can’t represent the multivalued attribute directly in the Contacts relation scheme. Instead, we will model it using its own relation scheme. Thus, we remove the old hobbies attribute and create a new scheme, very similar to the one that we created for phone numbers in the repeated attribute design pattern.

Contact hobbies relation scheme diagram with a relation scheme for the multivalued attribute hobbies
Contact hobbies relation scheme.

The relationship between Contacts and Hobbies is one-to-many, so we create the usual pk-fk pair. The new scheme has only one descriptive attribute, the hobby name. To uniquely identify each row of the table, we need to know both which contact this hobby belongs to and which hobby it is, so the two attributes together form the pk of the scheme.
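One way the two schemes might be declared is sketched below, again with Python's sqlite3 module. The column types, the lowercase names, and the helper try_insert are assumptions for illustration. Note that SQLite only enforces foreign keys when the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces fks only when enabled
conn.executescript(
    """
    CREATE TABLE contacts (
        contactid INTEGER PRIMARY KEY,
        firstname TEXT,
        lastname  TEXT
    );
    CREATE TABLE hobbies (
        contactid INTEGER NOT NULL REFERENCES contacts (contactid),  -- fk
        hobby     TEXT    NOT NULL,
        PRIMARY KEY (contactid, hobby)                               -- composite pk
    );
    """
)

conn.execute("INSERT INTO contacts VALUES (5629, 'Susan', 'Noble')")
conn.execute("INSERT INTO hobbies VALUES (5629, 'hiking')")
conn.execute("INSERT INTO hobbies VALUES (5629, 'movies')")


def try_insert(contactid, hobby):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO hobbies VALUES (?, ?)", (contactid, hobby))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(5629, "hiking"))  # False: same contact and hobby, duplicate pk
print(try_insert(9999, "chess"))   # False: no such contact, fk violation
```

The composite pk lets one contact have many hobbies and lets many contacts share a hobby, while ruling out the same hobby being listed twice for the same person.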

With data entered, the new table looks similar to the PhoneNumbers table. It can also be joined to Contacts on matching pk-fk contactID pairs, re-creating the original data in a form that we can now conveniently use for queries.

Hobbies
contactid  hobby
1639       reading
5629       hiking
5629       movies
3388       hockey
3388       skiing
1911       photography
1911       travel
1911       art
4848       gourmet cooking
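The queries that were awkward with the old design become straightforward once each hobby has its own row. The following is a sketch under the same assumptions as before (Python's sqlite3 module, lowercase table and column names, and hypothetical function names). It uses a LEFT JOIN when re-creating the original data so that a contact with no hobbies still appears; an inner join would leave that contact out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contacts (
        contactid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT
    );
    CREATE TABLE hobbies (
        contactid INTEGER NOT NULL REFERENCES contacts (contactid),
        hobby     TEXT    NOT NULL,
        PRIMARY KEY (contactid, hobby)
    );
    """
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1639, "George", "Barnes"), (5629, "Susan", "Noble"),
        (3388, "Erwin", "Star"), (5772, "Alice", "Buck"),
        (1911, "Frank", "Borders"), (4848, "Hanna", "Diedrich"),
    ],
)
conn.executemany(
    "INSERT INTO hobbies VALUES (?, ?)",
    [
        (1639, "reading"), (5629, "hiking"), (5629, "movies"),
        (3388, "hockey"), (3388, "skiing"), (1911, "photography"),
        (1911, "travel"), (1911, "art"), (4848, "gourmet cooking"),
    ],
)


def contacts_with(hobby):
    """Exact-match search for everyone who has the given hobby."""
    rows = conn.execute(
        "SELECT c.firstname, c.lastname FROM contacts c"
        " JOIN hobbies h ON h.contactid = c.contactid"
        " WHERE h.hobby = ? ORDER BY c.contactid",
        (hobby,),
    )
    return rows.fetchall()


def all_hobbies():
    """List each distinct hobby once, in alphabetical order."""
    rows = conn.execute("SELECT DISTINCT hobby FROM hobbies ORDER BY hobby")
    return [hobby for (hobby,) in rows]


def hobbies_per_contact():
    """Re-create the original data: one row per contact and hobby."""
    rows = conn.execute(
        "SELECT c.firstname, c.lastname, h.hobby FROM contacts c"
        " LEFT JOIN hobbies h ON h.contactid = c.contactid"
        " ORDER BY c.contactid, h.hobby"
    )
    return rows.fetchall()


print(contacts_with("art"))      # [('Frank', 'Borders')]
print(contacts_with("cooking"))  # []: only whole hobby names match now
print(all_hobbies())
```

Alice has no rows in the hobbies table at all, so her missing hobbies are represented by the absence of data rather than by a NULL stored in the table. The NULL only appears in the result of the LEFT JOIN.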