Data-orientation vs. Document-orientation 2: Just how normal are you?

Posted by jimcarls on September 19, 2011

In the last segment, on the limitations of "document-oriented" thinking, I looked at one end of the range of ways to organize FF&E specifications data: a sprawling, "all-in-one document" approach that recorded all the information about a project's FF&E specifications and their use in a single spreadsheet. In this one, I will look at how the search for more efficient ways to create documents takes us in the direction of a very specific way of organizing data.

Keeping an all-in-one "flat" type of spreadsheet up to date (and intact) takes a lot of effort. It quickly leads most designers to simplify the whole approach, often creating just two things:

1) A simple table (or "schedule") showing where FF&E objects occur using Room IDs and object Tags, either as a single table (like below) or separate tables for each room type:

 project_spreadsheet_norm_1a.png

2) Object specification sheets in a word-processor (or multiple worksheets in a spreadsheet) explaining the objects to which the Tags refer:

project_spreadsheet_product_sheet.png

In doing this, they have stumbled upon one of the most basic (and cryptically named) ideas of database management: "normalization." To normalize data simply means to separate it into its unique categories, similar to having file drawers in which each drawer represents one and only one category of thing and each sheet of paper in the drawer represents one unique example. If you need to know the data for that thing, you can be sure that there is only one "official" source for it. This gives you better control over changes and reduces duplicate data entry. When a single category of data is placed in a much simpler type of spreadsheet layout that we call a "table," the table corresponds to the file drawer and each row in the table corresponds to one sheet in the drawer, so the spec sheet above can also be represented as a row in an object table.

TableToFileCabinet.png

Each row in a table represents one unique item or one sheet in an imaginary file drawer.

Now, if a room item needs to be corrected or replaced, it's easier to make a change by changing something on that one specification sheet or by pointing to a different sheet (the latter keeps the old spec intact in case someone changes their mind). For the most part, changes to the product spec details only have to be made on that sheet. 
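To make the file-drawer idea concrete, here is a minimal sketch in Python. The Tags, room IDs and finishes are made up for illustration, and the names `spec_sheets`, `schedule` and `describe` are my own assumptions rather than part of any particular tool.

```python
# Hypothetical spec sheets: one entry per Tag, like one sheet in the drawer.
spec_sheets = {
    "CH-1": {"description": "Club chair", "finish": "Walnut"},
    "CH-1A": {"description": "Club chair, alternate", "finish": "Oak"},
    "TB-1": {"description": "Side table", "finish": "Walnut"},
}

# Hypothetical schedule: each row only says which Tag occurs in which room.
schedule = [
    {"room_id": "101", "tag": "CH-1", "count": 2},
    {"room_id": "102", "tag": "CH-1", "count": 2},
    {"room_id": "102", "tag": "TB-1", "count": 1},
]

def describe(row):
    """Look up the sheet a schedule row points to and describe the item."""
    sheet = spec_sheets[row["tag"]]
    return f'Room {row["room_id"]}: {row["count"]} x {sheet["description"]} ({sheet["finish"]})'

# Correct a detail on one sheet; every room that uses the Tag picks it up.
spec_sheets["CH-1"]["finish"] = "Ebonized walnut"

# Or point one room at a different sheet, leaving the old spec intact.
schedule[1]["tag"] = "CH-1A"

lines = [describe(row) for row in schedule]
for line in lines:
    print(line)
```

The schedule never stores the finish itself, so there is nothing to hunt down and retype when a spec changes.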

That said, our new approach still falls short of complete normalization, because the cost (a characteristic of a product) and the room count (a characteristic of a room) are both needed on the spreadsheet to calculate the extended costs. That means the product price is repeated in each room where the product is used, so we still have a potential source of errors if the product price or room count changes. Further, the specification sheet contains information about the client, the shipping address and the vendor that must be copied over and over if that document is the source of an object's product data.
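A small sketch shows how that repetition can bite. The figures and field names are invented, and I'm assuming `room_count` means the number of rooms of that type, with one chair in each.

```python
# Hypothetical schedule rows: the unit price is copied into every row
# that uses the Tag, which is the repetition described above.
schedule = [
    {"room_type": "King", "tag": "CH-1", "room_count": 10, "unit_price": 450.00},
    {"room_type": "Suite", "tag": "CH-1", "room_count": 4, "unit_price": 450.00},
]

# The price goes up, but only the first row gets corrected.
schedule[0]["unit_price"] = 475.00

def extended_cost(row):
    """Extended cost for one row: number of rooms times the unit price."""
    return row["room_count"] * row["unit_price"]

total = sum(extended_cost(row) for row in schedule)

# Collect every price recorded for each Tag; more than one means trouble.
prices_by_tag = {}
for row in schedule:
    prices_by_tag.setdefault(row["tag"], set()).add(row["unit_price"])
conflicts = {tag: sorted(prices) for tag, prices in prices_by_tag.items() if len(prices) > 1}

print(total)      # 6550.0, not the 6650.0 you get with the new price everywhere
print(conflicts)  # {'CH-1': [450.0, 475.0]}
```

Nothing in the spreadsheet itself flags the stale row; you only find it if you go looking.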

However, this simple arrangement works well for many people — as long as you are scrupulous about following through on changes and never want to sort the spreadsheet list by something that is not on it (like vendor, product type or an accounting code).  This simple shift from our first all-in-one spreadsheet to our somewhat "normalized" arrangement is more efficient and has us moving towards "data-orientation," but the limitations just mentioned show that we aren't there yet. Really...wouldn't you like to be able to sort the projected costs and prices by vendor? Wouldn't you like to be able to update a product cost in just one place and have your totals updated automatically? Can you get a project total for a specific object's quantity with this arrangement?

To see how data "wants" to be organized efficiently, let's return to the filing cabinet metaphor. In a more normalized organization of data, there would be a "drawer" for rooms, one for finished FF&E objects, one for the specifications of the products to be used in those objects and one for the vendors of those products. A sheet from the Object drawer might have a generic functional description of the object ("Club chair") and a budget, but would actually point to the Specifications drawer for the products to use and the actual price. Sheets from the Specifications drawer would have everything about the required products, but no detail about the vendor except a reference to the sheet in the Vendor drawer where the details about the vendor could be found. All products for that vendor would reference that sheet. The specifications sheet would therefore only include the "ID" of the manufacturer or vendor of that product and that would be the key to finding the rest of the vendor information in the Vendor drawer.
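Here is one way the drawers might look if each were a simple Python dictionary keyed by ID. This is only a sketch: every ID, name and price is made up, and a real database would hold these as tables rather than dictionaries.

```python
# Each dict plays the role of one "drawer"; each entry is one "sheet",
# filed under its ID. All values are hypothetical.
vendors = {
    "V-01": {"name": "Example Seating Co.", "city": "Chicago"},
    "V-02": {"name": "Sample Textiles", "city": "Atlanta"},
}

specifications = {
    "P-100": {"description": "Club chair frame", "price": 400.00, "vendor_id": "V-01"},
    "P-101": {"description": "Ottoman frame", "price": 150.00, "vendor_id": "V-01"},
    "P-200": {"description": "Upholstery fabric", "price": 75.00, "vendor_id": "V-02"},
}

objects = {
    "CH-1": {"description": "Club chair", "budget": 500.00, "spec_ids": ["P-100", "P-200"]},
}

rooms = {
    "101": {"name": "King guestroom", "object_tags": ["CH-1"]},
}

def vendor_name_for(spec_id):
    """Follow the vendor ID on a specification sheet to the vendor sheet."""
    return vendors[specifications[spec_id]["vendor_id"]]["name"]

def actual_price(tag):
    """Add up the prices of the products an object points to."""
    return sum(specifications[spec_id]["price"] for spec_id in objects[tag]["spec_ids"])

# Vendor details live on one sheet, so one change reaches every product.
vendors["V-01"]["name"] = "Example Contract Seating"
names = [vendor_name_for(spec_id) for spec_id in ("P-100", "P-101")]

print(names)
print(actual_price("CH-1"), "against a budget of", objects["CH-1"]["budget"])
```

Notice that the specification sheets carry only a `vendor_id`; the vendor's name and city appear exactly once.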

Now, we're getting closer, because to communicate to someone everything about a chair that goes in a room, you simply pull sheets from drawers: a sheet for the chair object, which tells you to pull a sheet for the chair frame and a sheet for the fabric (if it comes from another source). Each of these tells you which sheet(s) to pull from the Vendor drawer. So instead of writing a document, you are simply stapling a set of sheets together. If you imagine that the sheets are made of a translucent paper like vellum (or like "layers" in a CAD system), you now get the effect of looking at a single sheet and seeing related information compiled from different source files (tables).
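In a database, the "stapling" is done by a query that joins tables on their IDs. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table layout, column names and sample rows are all my own assumptions, chosen only to show the idea, including a hypothetical `object_specs` table that records which products make up each object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendors (vendor_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE specifications (spec_id TEXT PRIMARY KEY, description TEXT,
                             price REAL, vendor_id TEXT REFERENCES vendors(vendor_id));
CREATE TABLE objects (tag TEXT PRIMARY KEY, description TEXT);
CREATE TABLE object_specs (tag TEXT REFERENCES objects(tag),
                           spec_id TEXT REFERENCES specifications(spec_id));
CREATE TABLE schedule (room_id TEXT, tag TEXT REFERENCES objects(tag), quantity INTEGER);

INSERT INTO vendors VALUES ('V-01', 'Example Seating Co.'), ('V-02', 'Sample Textiles');
INSERT INTO specifications VALUES
    ('P-100', 'Club chair frame', 400.0, 'V-01'),
    ('P-200', 'Upholstery fabric', 75.0, 'V-02');
INSERT INTO objects VALUES ('CH-1', 'Club chair');
INSERT INTO object_specs VALUES ('CH-1', 'P-100'), ('CH-1', 'P-200');
INSERT INTO schedule VALUES ('101', 'CH-1', 2), ('102', 'CH-1', 1);
""")

# "Staple" the sheets together: one report row per product per room,
# sorted by vendor even though the schedule table holds no vendor data.
report = conn.execute("""
    SELECT s.room_id, o.description, p.description, v.name, s.quantity * p.price
    FROM schedule s
    JOIN objects o ON o.tag = s.tag
    JOIN object_specs link ON link.tag = o.tag
    JOIN specifications p ON p.spec_id = link.spec_id
    JOIN vendors v ON v.vendor_id = p.vendor_id
    ORDER BY v.name, s.room_id
""").fetchall()

# Project-wide quantity for one object Tag.
total_qty = conn.execute(
    "SELECT SUM(quantity) FROM schedule WHERE tag = ?", ("CH-1",)
).fetchone()[0]

def project_total():
    """Recompute the project cost from the current prices."""
    return conn.execute("""
        SELECT SUM(s.quantity * p.price)
        FROM schedule s
        JOIN object_specs link ON link.tag = s.tag
        JOIN specifications p ON p.spec_id = link.spec_id
    """).fetchone()[0]

before = project_total()
# Update the price in exactly one place; the total follows automatically.
conn.execute("UPDATE specifications SET price = 425.0 WHERE spec_id = 'P-100'")
after = project_total()

for row in report:
    print(row)
print(total_qty, before, after)
```

The report is never stored anywhere. It is rebuilt from the tables each time it is asked for, which is why a single price change shows up in every total.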

When you begin to treat what you see on a report or a computer screen as the end result of a process that assembles "documents" from tightly organized tables of data, you will have the key to understanding data-orientation.

In my next entry, I'll develop the idea of "normalization" further. 

Probably further than you really want to go.

However, even a basic understanding of how a true database works can open up opportunities not just for quickly creating accurate documents, but also for reusing and manipulating that information to leverage your previous work in future projects.
