In my dreams I'm hoping to find time to find data for exactly when/how often each character has appeared, whether or not in their own story. Barney does a pretty job of this, e.g. noting when Anderson and Hershey are in some epsiodes of Judge Dredd -
Ah, yes - me too. Whenever I think about this I think that there's no point in the project unless it improves on Barney (because we already have Barney).
My own data is in a terrible format when compared to a relational database, but it is useful for me - it allowed me to quickly make all the charts on this thread with just a few minutes work.
But a giant table that repeats information all over the place is not efficient (or easily searchable) - and generally doesn't include character information. I do try to get crossover stuff in there, but it involves duplicating rows. Here's my current example of Karyn, which I know is missing some key stuff (like Martyrs, for example):
Very interesting.
FWIW I've also thought about this quite a bit. Off and on I've been messing around trying to create a little application to create my own bespoke digital GNs using a combination of the cbz files I've bought, Barney and an index of content.
Say I want to generate my own collection of every Sinister Dexter story since they arrive in Generica. I have all the files. I can get the relevant prog numbers from Barney. I can use that list to filter my files. The biggest problem I have is getting the right pages from those files into my new one. (The second biggest is handling correct page orientation in case there are splashes.) Keeps me occupied.
About your ER diagram
... takes deep breath...
If you're thinking about this as an exercise in data modelling and design then that's fair enough but as something functional for working with such a small dataset it's way too complex to me. I didn't create Barney and I'd love to be able to change some things about the database. However, it does quite a few cool things with a simple underlying structure just by smart querying.
You've already identified the biggest issue: if you normalise everything to this extent you have to spend ages building the interface to populate it. For querying and reporting though, a few views on top would give you everything you want.
Some specific comments/questions:
- I honestly got a bit lost in the supertitle/title/subtitle structure but I assume supertitle = strip and title = story name. The interface tables look pretty redundant unless you're really desperate to count the stories from the 2020 sci-fi special under both strips and, if I follow the example of "Sinister Dexter: Bulletopia: Ghostlands" being broken down, you're getting dangerously close to tblWord(ID(int), WORDNAME(varchar32))
For me Strip and Title are enough but if you want to have all the others separate then a single with the all the fields and just coalesce them into a single value for display.
- You don't need a separate table for psuedonyms. Put them in the main creator table and add a self-key field referencing the "real" creator. In your design there's no way to get from a specific pseudonym to the strips it was used on. This way you can record the pseudonym in the main "gig" table and switch between displaying real and pseudo much more easily. Of course I'm not sure how you would handle the Alan Smithee case where multiple creators use the same pseudonym at different times...
- The biggest gap (as I'm sure you're aware) is the comics themselves. As far as I can see you have one table "comicTitle" but you're going to need something to classify those into prog, meg, etc and order them. But how to order them? I'm in love with the cover date but then how do you sort everything so that the Meg 427 (which came out on the same day as prog 2212) doesn't come after prog 2216. Do you use [prev issue cover date] + 1? Or add a separate ordering field? Lots of fun, which also brings us to...
- Content. If I follow the design correctly, datestart and dateend don't belong in this table. They're both tied to the cover date of the issues they appeared in so should be pushed back there.
- Strip/Supertitle vs character. I understand why you need both, but are you really going to have [Strip].[Judge Dredd] and then add [Character].[Judge Dredd] to every story?
- Gig. For any attempt to get page counts by artist, for example, this is a huge problem. Just "episode" isn't enough. How do you handle something like the Anderson story Undertow or Victims of Bennet Beeny? No detail = all episodes, otherwise break out?
- Primary Keys. I am also old and wedded to the idea of a generated, numeric key but we don't have to think this way! PROG, MEG, INKER are all fine.
Wow, this is the longest post I've typed in ages. This was fun, lets do it again soon.