Flat File vs Relational Database System Assignment

My latest assignment for my Data Visualizations class give me this case,

"You have some concerns about moving your entire airline operations out of SQL to this flat file format. Write your boss a memo, outlining any concerns or hesitations you have about moving to this format for management of your data. Include the pros and cons of the relational database format and the flat file format. Be sure to think critically, and include any problematic use case scenarios."

Sir,

I know from the last few staff meetings that we are looking to change our database configuration from a traditional relational database mangement system (RDBMS) to a flat file storage based system.

Advantages of a Flat File System

Two of the main advantages of a flat file system are the simplicity of record storage and the ease of use of the data.

Record Storage

One of the main advantages of a flat file based system is having all of the available data in the same location. This means that all the data available is within any given record. In the case of our data a flat file system would look like this.

Flight_ID Airport_Code_Origin Airport_Code_Destination Departure_DateTime Arrival_DateTime Airport_Code Airport_Location Year_Opened Num_of_Terminals Manufacturer Model_Num Original_Purchase_Date Last_Service Number_of_Seats Carrier_Name 1 MIA JFK 2/20/16 23:26 2/21/16 4:04 JFK New York, New York 1943 12 Boeing 737-900 12/2/09 1/26/16 500 Virgin Atlantic
2 MIA SFO 2/21/16 8:55 2/21/16 9:11 SFO San Francisco, CA 1927 8 Boeing 737-900 12/2/09 1/26/16 500 Southwest Airlines
3 LAS PHL 2/21/16 12:51 2/21/16 15:18 PHL Philadelphia, PA 1927 6 Embraer RJ-45 11/25/08 1/27/16 550 Delta
4 SFO PIT 2/21/16 21:45 2/21/16 23:18 PIT Pittsburgh, PA 1946 4 Boeing 747-400 10/25/01 1/3/16 250 Southwest Airlines
5 IAH PIT 2/22/16 19:34 2/22/16 22:27 PIT Pittsburgh, PA 1946 4 Airbus A330 12/2/01 12/16/15 400 Virgin Atlantic

Each flight record would include all the necessary fields to describe flight. This setup makes the data very readable and understandle to anyone who has access to the file.

Ease of Use

The second advantage we will examine is the ease of use for the database. A flat file system can be viewed from any number of applications making is very accessible. Users will also have very little difficulty with understanding the data because each record contains all available about a given flight. Simple queries and sorting should be no problem for most flat file based systems.

Disadvantages of Flat File Systems

While a flat file based system can work for some datasets I do not think it is appropriate for us. There are several disadvangtes to using a flat file system for our data. These include data duplication, difficulty of updating, and data security.

Data Duplication

In the example data extract shown above records 1 and 2 have flights orignating from the MIA airport. Each record includes the data about that airport. This data duplication causes the size of the flat file system to increase with unnecessary data. Not only is the originating airports data duplicated, the aircraft data is also duplicated for every flight on that days route. This duplication does not exsist in our RDBMS because the airport data is only stored once. The same is true for aircraft and carrier data. This leads us into the next disadvantage, difficulty updating records.

Update Difficulty

As was mentioned in the last section there is a lot of duplicate data in the flat file. Imgaine when an update needs to be made to a piece of data. In the case of an airccraft we keep track of the last service date. When and aircraft is serviced we will need to update every instance of that aircraft within the flat file. That would me traversing the entire data file and updating the last service date. Even with an automated update script it would be a time consuming and error-prone process. This is just one use case where updates could be difficult. The same process would need to be applied for airport and carrier data. There would need to be multiple users accessing the data to process the amount of updates we would generate. Who would have access and how would we control it is the last disadvantage I would like to expand upon.

Data Security

Data housed within flat file systems is hard to restrict. Only file level restrictions can be applied in this type of configuration. Individual records cannot be protected. A user who has access to the file has access to all of the records. In our system we need to restrict the level of access on a per flight basis. This would not be possible in a flat file system.

Recommendation

My recommendation is that we keep our system as it currently configured. Since we are using a RDBMS our data can be accessed in a variety of different ways. From screens within the airport terminals showing read only listings of flights to gate agents updating flight records our data is much more customizable to our needs in a RDBMS.