Wednesday, May 14, 2008

Table normalization verses long-term data storage

I'm currently working on an application that involves long-term storage of assessment data. Users submit records of their activities and assess their performance, and then reviewers look over those assessments and denote whether they agree or disagree with them. Each assessment database record is related to a reviewer through the unique reviewer id that is part of the assessment record, and I can use that relationship to retrieve the reviewer's name whenever I display the assessment record.

It's a standard example of table normalization. If the reviewer's name was stored within the assessment record itself, and the reviewer changed their name for some reason (marriage, divorce, mid-life crisis, etc.), the application would have to update the name in both the reviewer's record AND the assessment record. But by using the reviewer's id in the assessment record to establish a relationship between the assessment record and the reviewer record, the reviewer's name only needs to be recorded or updated once.

However, this project will entail keeping the assessment data for an undetermined number of years. With the data arrangement I just described, that means I would have to store the assessment records and all of the related reviewer records if I want to be able to keep showing the name of the reviewer when looking at older assessment records. That could result in keeping a lot of extra data about reviewers (addresses, e-mail addresses, logins, passwords, etc.) who are no longer associated with the program simply because we need to keep their name tied to the assessments.

I think this is one of those situations where it makes sense to repeat a little data. Recording the reviewer's name in the assessment records allows me to let the administrative users of the application delete reviewer user accounts without impacting historical data. It means a bit more work in keeping the reviewer's name the same in both records, but in the long run I think it's worth the effort.

No comments:

Post a Comment