One of the challenges of working with databases is that the structures required to store data for maximum flexibility sometimes end up being an obstacle to providing quick answers to routine questions. I have been finding that at work recently, with requests for information from one of our research data sets taking a lot longer to answer than I would like, along with a constant danger of apparently innocuous queries disappearing off the time-out cliff.
Following some research and experimentation, I have decided that the pragmatic solution is to have two databases. One is the existing one, primarily focused on capturing and storing the information. The other is produced by a series of queries that build a second copy of the data set with a much simpler structure. I deliberately flout many of the principles of data normalisation because most of the people who actually want the data won’t be accessing it directly but require a simple flat file in which all the information about a given case is rolled up into a single row.
With this simplified version, I have already been able to answer one request very quickly (well, discounting the time it took to develop the concept!); hopefully I will have more requests roll in soon to help prove and refine the process while it is still fresh in my mind.