Data Modeling Question
There are a few development projects on my plate right now which have me questioning how I have traditionally developed data-rich websites in Drupal. I’d like to share some specifics here, to see if anyone has any feedback about possible alternatives.
The project in question is for a digital image library with a robust set of metadata in conjunction with the images themselves. The dataset is reasonably large – 34,000+ images, and by the time you include the nodes for creators, works, locations, etc you have somewhere around 200,000 nodes in the system. However, a large chunk of these nodes are wrapped up in what I’m calling “credit” tables. The image at the top of this post illustrates the relationships.
On the right side, we have the table which contains the media assets themselves. There are 34,000+ records in this table. Each of these media assets is related to one or more people or organizations who have a very specific relationship to the media asset. For example, the media asset has been given to the library by someone (who in this case acts as a “contributor”). Someone else may hold copyright over the image, however. A third person may have been the actual photographer. Each media asset in the library will have, on average, three credited relationships to someone from the creators table. This means the media asset credit table currently has over 100,000 records.
So far, I’ve built three of these tables as content types using CCK (Media Asset, Creator, and Media Asset Credit). The fourth pictured table, Media Asset Role, has been created as a taxonomy as there isn’t any metadata about these roles which we need to store.
The problem that I’m having is that, in Drupal terms, it doesn’t seem quite proper for the Media Asset Credit table to be its own content type. These records don’t have anything I would call a title, or a body. If I were building this outside of Drupal, the credit table would simply be a link table – if I were feeling super efficient that day I might even define a multi-field primary key on the three foriegn key fields, and be done with it.
So this is my question:
What is the best way to create link tables in Drupal?
Is it appropriate in your mind to build out this Media Asset Credit table as a content type via CCK? Or is this something that would be better tackled in some other way? I don’t have the skills (yet) to do any custom module development, but there’s a voice in the back of my head that is starting to whisper about that being a better solution.
Another thought which has occurred to me is to abandon the Media Asset Role taxonomy, and add a series of node references to the media asset table – one for “contributor”, one for “copyright holder”, another for “photographer”, etc. This flies in the face of data normalization, but it would be a way to get rid of this very large table. Doing so, however, would come at the cost of making certain views we need to build much more difficult – if not impossible – because of how the view relationships would then have to be created.
If you have any thoughts here, I would love to hear them. Please leave a comment below.