Managing Data With Database Tools
Of all the tools in the information technology arsenal, none is arguably more important to business operations than the databases used to organize, store, access, update and manage large structured collections of information. Today, there are four major models that represent how databases are organized:
- Hierarchical: Database records are organized into a classic tree structure, where a single record or set of data acts like the root, and other records may be children of that root, or children of its children, and so on. Parent-child relationships are indicated by pointers, where parents have links to all their children, and children may have links to their parent. Hierarchical databases handle one-to-many relationships well, but don’t handle many-to-many relationships well.
- Network: Also known as the entity-relationship model, or ER model, this approach uses sets to represent relationships, rather than hierarchy. The network model is a subset of the hierarchical model, but also allows child records or sets of data (also called tables) to have more than one parent. This makes it easy to represent many-to-many relationships. Though more flexible and powerful than the hierarchical model, this model proved too difficult for ordinary end users to employ.
- Relational: This model is built around structured collections of data called tables. (Horizontal rows in the table are records; vertical columns in the table are fields or attributes.) Storage location is no longer important. One need only know that tables have unique names, so that names lead users to data without requiring them to navigate around inside tree structures or using notions of structure to access, insert, update or delete records from a database. Instead, search criteria may be used to locate unique tables, and to extract data from such tables.
- Object-oriented: This model defines database contents in terms of classes, subclasses and the relationships among them. Also, properties define data item (like attributes or fields) characteristics, and methods define the operations that can be applied to records or their component parts. Object-oriented (OO) databases are typical extensions of relational databases, but are able to support more complex and varied types of data, and to perform more specialized and complex operations on such data.
The first two models were developed in the 1960s and haven’t been used much since the 1980s (except in Information Management System, or IMS, and similar database management systems, or DBMSs, which remain in wide use to this day). The second two models enjoy a considerable degree of overlap, so that while some databases may be purely relational and not OO, nearly all OO databases also are relational.
Understanding What DBMSs Do
A DBMS is a computer-based system that provides mechanisms for storing, retrieving and updating structured data. A DBMS also provides lots of additional capabilities that cover a broad range of features and functions:
- Logical database views: Sometimes, database contents may be confidential or sensitive, so that only those with a need to know should be allowed to see, add or update such information. Sometimes, users might be allowed to create statistical views of such data without being allowed to see individual, underlying records. At other times, only authorized personnel might be allowed any interaction with such data. Most modern databases support logical views, which show users only the database structure and contents they’re allowed to see, and deny them access to other parts of the database.
- Database administration: This category of activity includes designing database structures and logical views, and managing user rights and permissions. It also may occasionally involve importing data into the database from other sources, or exporting data into other applications or services for re-use or analysis. Only trained, authorized database professionals are generally allowed to define database structures or to manipulate and maintain them. The same goes for managing users and access, importing or exporting data and so forth. This is usually the province of a class of professionals known as DBAs, or database administrators. Lots of DBMSs offer certification programs to make sure people tasked with the DBA role are prepared for and qualified to do the job.
- Reporting: Complex, systematic database queries produce special documents called reports, many of which are run at weekly, monthly, quarterly or yearly intervals to meet internal or external reporting requirements. Somebody has to design and build reporting tools (usually DBAs and other database professionals or database programmers). Somebody also has to run those reports as needed (usually DBAs or their subordinates, sometimes called database technicians or database operators).
- Data integrity management and availability: For legal, regulatory, financial or business reasons, companies and organizations not only must store important data in their database, but also must keep its contents safe from loss and harm. Backup, shadow copies, rollback and recovery are all mundane but important capabilities that most modern databases support in various ways. Above and beyond protecting the contents of databases, making them available to users as quickly and reliably as possible also has spawned all kinds of interesting database technologies for clustering, failover, and real-time mirroring and shadowing capabilities.
- Application support: For everything from generating reports to driving human relations, accounting, manufacturing, warehousing, shipping and other business operations, databases often provide the storage for key business information, while various types of custom applications provide the “in and out” functions for such data and permit it to be used in all kinds of creative ways.
- Web access: There’s a huge class of application programming interfaces (such as .NET, Java 2 Enterprise Edition, ASP, Java Server Pages and so forth) designed to bring database-driven information and services to the Web, and to interact with users over the Web to ferry data back the other way. This also helps to explain the nearly universal support for XML in most DBMSs and application development environments as well—and often, the two do meet on the Web these days.
As data collections get bigger and more complex, most DBMSs are designed to scale to match related storage, management and access needs. Thus, it’s also true that most serious DBMS environments can work in distributed fashion, as well as from a single database server (or cluster of servers). This means that multiple collections of data can be managed and accessed from multiple database servers that work together to provide a consistent and coherent logical view of the whole shooting match, also while managing concurrency and access controls, ensuring data and transaction integrity, and so forth.
In short, a modern database management system is a large and complex processing environment. It’s usually host to all kinds of tools and services for managing and administrating, but also for supporting numerous line-of-business applications and services for everything from accounts payable to enterprise resource planning (ERP). Many large-scale deployments cost millions, if not tens of millions of dollars, so it’s no wonder that nearly all of the major DBMS platforms covered in Figure 1 also are buttressed by comprehensive training and certification programs for database administrators, and often for database operators and application developers as well.
Where the DBMS Market Stands
On one side of the DBMS market, you’ll find a handful of companies and platform