
Data Modeling Question

August 14, 2010
A data relationship model showing four data tables and the relationships between them

Media Asset Credit

There are a few development projects on my plate right now which have me questioning how I have traditionally developed data-rich websites in Drupal. I’d like to share some specifics here, to see if anyone has any feedback about possible alternatives.

The project in question is a digital image library that stores a robust set of metadata alongside the images themselves. The dataset is reasonably large – 34,000+ images, and by the time you include the nodes for creators, works, locations, etc., you have somewhere around 200,000 nodes in the system. However, a large chunk of these nodes are wrapped up in what I’m calling “credit” tables. The image at the top of this post illustrates the relationships.

On the right side, we have the table which contains the media assets themselves. There are 34,000+ records in this table. Each of these media assets is related to one or more people or organizations who have a very specific relationship to the media asset. For example, the media asset has been given to the library by someone (who in this case acts as a “contributor”). Someone else may hold copyright over the image, however. A third person may have been the actual photographer. Each media asset in the library will have, on average, three credited relationships to someone from the creators table. This means the media asset credit table currently has over 100,000 records.

So far, I’ve built three of these tables as content types using CCK (Media Asset, Creator, and Media Asset Credit). The fourth pictured table, Media Asset Role, has been created as a taxonomy as there isn’t any metadata about these roles which we need to store.

The problem that I’m having is that, in Drupal terms, it doesn’t seem quite proper for the Media Asset Credit table to be its own content type. These records don’t have anything I would call a title or a body. If I were building this outside of Drupal, the credit table would simply be a link table – if I were feeling super efficient that day I might even define a multi-field primary key on the three foreign key fields, and be done with it.
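To make the link-table idea concrete, here is a minimal sketch of what it might look like outside of Drupal, using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions for illustration that loosely follow the diagram; this is not the actual schema of the project.

```python
import sqlite3


def build_schema(conn):
    """Create the four tables; the credit table is a pure link table."""
    conn.executescript("""
        CREATE TABLE media_asset (id INTEGER PRIMARY KEY, filename TEXT NOT NULL);
        CREATE TABLE creator (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE media_asset_role (id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);
        -- No title, no body: just three foreign keys that together form the key.
        CREATE TABLE media_asset_credit (
            media_asset_id INTEGER NOT NULL REFERENCES media_asset(id),
            creator_id     INTEGER NOT NULL REFERENCES creator(id),
            role_id        INTEGER NOT NULL REFERENCES media_asset_role(id),
            PRIMARY KEY (media_asset_id, creator_id, role_id)
        );
    """)


def credits_for(conn, asset_id):
    """Return (role label, creator name) pairs for one media asset."""
    return conn.execute(
        """
        SELECT r.label, c.name
        FROM media_asset_credit AS k
        JOIN creator AS c ON c.id = k.creator_id
        JOIN media_asset_role AS r ON r.id = k.role_id
        WHERE k.media_asset_id = ?
        ORDER BY r.label
        """,
        (asset_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
build_schema(conn)

# Hypothetical sample data: one image, two people, three credited relationships.
conn.execute("INSERT INTO media_asset VALUES (1, 'harbor.jpg')")
conn.executemany("INSERT INTO creator VALUES (?, ?)",
                 [(1, "Donor A"), (2, "Photographer B")])
conn.executemany("INSERT INTO media_asset_role VALUES (?, ?)",
                 [(1, "contributor"), (2, "copyright holder"), (3, "photographer")])
conn.executemany("INSERT INTO media_asset_credit VALUES (?, ?, ?)",
                 [(1, 1, 1), (1, 1, 2), (1, 2, 3)])

print(credits_for(conn, 1))
```

The composite primary key means the same person cannot be credited twice in the same role on the same asset, while still allowing one person to hold several roles on one image.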

So this is my question:

What is the best way to create link tables in Drupal?

Is it appropriate in your mind to build out this Media Asset Credit table as a content type via CCK? Or is this something that would be better tackled in some other way? I don’t have the skills (yet) to do any custom module development, but there’s a voice in the back of my head that is starting to whisper about that being a better solution.

Another thought which has occurred to me is to abandon the Media Asset Role taxonomy, and add a series of node references to the media asset table – one for “contributor”, one for “copyright holder”, another for “photographer”, etc. This flies in the face of data normalization, but it would be a way to get rid of this very large table. Doing so, however, would come at the cost of making certain views we need to build much more difficult – if not impossible – because of how the view relationships would then have to be created.
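A rough sketch of that denormalized alternative, again in plain Python with sqlite3 rather than in Drupal, hints at where the difficulty with views comes from. It assumes one single-valued reference column per role, and all names and sample rows are hypothetical. Listing everything a given person is credited on now means checking every role column separately and stitching the results together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE creator (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One reference column per role instead of a separate credit table.
    CREATE TABLE media_asset (
        id INTEGER PRIMARY KEY,
        filename TEXT NOT NULL,
        contributor_id      INTEGER REFERENCES creator(id),
        copyright_holder_id INTEGER REFERENCES creator(id),
        photographer_id     INTEGER REFERENCES creator(id)
    );
    INSERT INTO creator VALUES (1, 'Donor A'), (2, 'Photographer B');
    INSERT INTO media_asset VALUES
        (1, 'harbor.jpg', 1, 1, 2),
        (2, 'bridge.jpg', 2, NULL, 2);
""")

# Every role has to be listed by hand; a new role means a new column
# and another entry here.
ROLE_COLUMNS = {
    "contributor_id": "contributor",
    "copyright_holder_id": "copyright holder",
    "photographer_id": "photographer",
}


def credits_by_creator(conn, creator_id):
    """Return (filename, role) pairs for one creator across all role columns."""
    parts = [
        f"SELECT filename, '{label}' AS role FROM media_asset WHERE {column} = :cid"
        for column, label in ROLE_COLUMNS.items()
    ]
    sql = " UNION ALL ".join(parts) + " ORDER BY filename, role"
    return conn.execute(sql, {"cid": creator_id}).fetchall()


print(credits_by_creator(conn, 2))
```

With the link table, the same question is a single filter on one creator column, and adding a role is a new row rather than a schema change.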

If you have any thoughts here, I would love to hear them. Please leave a comment below.

One Comment
  1. August 14, 2010 4:30 pm

    For this particular use case, you might want to check out Islandora (http://islandora.org). It’s a Drupal front end with a digital archiving backend built on the Fedora Commons. It’s made for this kind of use case. Feel free to contact me if you want more info.

    Re: Media Asset Credit. I’m not quite sure what you’re doing here. When you add an Asset, you then have N Credit references, which link to Creator nodes? Oh, I get it – you’ve got this “link” table because you’ve got Roles that you need to label the relationship with; that is, you need both the label and the reference. No, it’s not the right way to do it, but this is how you CAN do it without coding. You could use Auto nodetitle and other things just to automate that table, but this is not ideal.

    I don’t want to send you down a rabbit hole, but this is where “multigroup” fields in CCK come in. There used to be a CCK 6.3.x dev branch that included this feature, and there is lots of discussion on Drupal.org about how to make this work for D7. See http://drupal.org/node/494100
