Term Paper on Database System
Term Paper Contents:
- Term Paper on the Introduction to Database System
- Term Paper on the Database Structure
- Term Paper on Database Query and Schema
- Term Paper on Database Administrator
- Term Paper on Data Processing
- Term Paper on Data Input
- Term Paper on Data Manipulation
- Term Paper on Data Output
- Term Paper on Data Organisation
- Term Paper on Data Files
- Term Paper on Processing Mode
Term Paper # 1. Introduction to Database System:
A database system is a collection of documents, procedures, programs, manuals, etc., which together help in the efficient and effective operation of data processing with a database in use.
The most important characteristics of a data base system are:
1. Controlled Redundancy: no unnecessary duplicate fields or data items exist.
2. Complete Data Independence, or as much as possible.
3. Quick response to requests for information.
4. Ease in building up new information generating applications.
5. If required, real-time accessibility.
6. Proper security protection.
7. Maintenance of data integrity.
The database management system has to be a dynamic one, able to absorb the new technologies which are increasingly being made available. In database systems, as in any other computer system, there are always two ways of representing the same thing: one is called the Logical View and the other the Physical View.
In the case of a database, the physical view represents how the different data items are actually stored in the computer, which is quite complex. The logical view refers to how the user sees the stored data for his own use, which is quite simple, although it is the same data. In other words, the physical and logical views are linked. But in modern database systems the linkage between the logical and physical data is made transparent: the user knows nothing about how the data is actually stored and linked with other data items.
If that is not done, the cost of maintaining the current application programs becomes much greater than the cost of developing new ones. Hence the concept of a database is shifting from a mere collection of independent files to a common database with data independence: the user is concerned only with the logical view, where data can be added, changed or deleted without any adverse effect on the physical storage.
Term Paper # 2. The Database Structure:
Apart from storing different groups of data items in different files, also called tables, a database system has to define the structure which links these files to each other. In fact, this relationship is what distinguishes a database system from a system of independent, unconnected data files, each going its own way.
The data files generally contain data grouped on some common consideration; for example, there would be one data file containing employee details such as names, addresses and family members, and another file containing their salary details, promotion dates, and so on. The reason for maintaining data files in this manner is to allow flexibility in use. For example, if the books on law, economics, engineering and the Companies Act required in a company were all bound together, not only would the result be a very fat book, it would also prevent different people from consulting different matters simultaneously in the same bound volume.
Different data files or tables are created with the same reasoning. But under a database system these files are linked with each other, so that one can get all the necessary information at one time, even though the required data may be stored in different files. This linking of data files can be done in three different ways, called Hierarchical, Network, and Relational.
(i) Hierarchical Structure:
In this type of structure, the records or aggregates of data items are logically conceived as being stored at different levels of a hierarchy, like a tree with many branches turned upside down. The relationship between entities is established in such a manner that a data item is linked to only one data item at the next higher level, though the reverse need not be true.
For example, a Department at Level 1 of the hierarchy can have many employees at Level 2, but no employee can belong to more than one department. It can be compared to a family tree traced through only the male or the female line: a father can have many sons, but each son has exactly one father at the next higher level.
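As a hedged illustration (a minimal Python sketch; the department and employee names are invented for the example), a hierarchy can be modelled as a mapping in which each parent owns its children and every child has exactly one parent:

    # Hierarchical structure: each employee appears under exactly one department.
    hierarchy = {
        "Accounts": ["Ashok Ray", "Kapil Dev"],
        "Stores": ["Meena Rao"],
    }

    # Walking the tree lists every employee with the one department that owns it.
    for department, employees in hierarchy.items():
        for employee in employees:
            print(department, "->", employee)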
(ii) Network Structure:
In this type of structure, multiple relationships between data items are allowed. It is somewhat like a family diagram with both parents included, so that a child has a father from one family branch and a mother from another. In this setup an entity may be linked to any number of entities of other types.
In a primary school, students usually have one teacher taking all classes, whereas in a college a number of teachers teach different subjects to the same group of students, and the students have to interact with all of them. The former is a hierarchical structure; the latter is of the network type.
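Continuing the same hedged Python sketch (the names are again invented), a network structure drops the one-parent restriction, so the links form a many-to-many mapping that can be read from either side:

    # Network structure: a student may be linked to many teachers and vice versa.
    teaches = {
        "Prof. Sen": ["Ashok Ray", "Kapil Dev"],
        "Prof. Iyer": ["Ashok Ray", "Meena Rao"],   # Ashok Ray has two teachers
    }

    # Inverting the links shows the same relationship from the students' side.
    students = {}
    for teacher, pupils in teaches.items():
        for pupil in pupils:
            students.setdefault(pupil, []).append(teacher)
    print(students["Ashok Ray"])   # ['Prof. Sen', 'Prof. Iyer']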
(iii) Relational Structure:
It is quite similar to the network structure, but the relationships are expressed in a specific manner, making it much simpler to conceive and execute. Different data files are linked by a common type of data item; that is, a common data item exists in each of the data files. For example, each student is given a unique roll number.
Two database tables are then built up: in one, the performance of the students is kept in different fields; in the other, details about their addresses, guardians, etc., are kept, this table also having a field containing the roll number. The roll number field is thus used to link the two tables and obtain the necessary data in different combinations, taking some from one table [file] and some from the other.
In this type of structure, the data is logically conceived as being represented in tabular form. The rows of the tables are called records by 'commoners' and tuples by enthusiasts of normalization. A tuple can be said to be the set of values of the data items, sometimes called data elements, of an entity.
The term pair is used to describe a tuple holding two values; in general, a tuple holding n values is called an n-tuple. Incidentally, the table itself is called a relation and its columns are called domains; a relation with n domains is said to be n-ary.
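As a concrete, hedged sketch (using Python's built-in sqlite3 module; the table names, fields and values are all invented for the example), the two student tables just described can be linked through the common roll number field:

    import sqlite3

    conn = sqlite3.connect(":memory:")   # a throwaway in-memory database
    conn.executescript("""
        CREATE TABLE performance (rollno INTEGER, subject TEXT, marks INTEGER);
        CREATE TABLE details     (rollno INTEGER, name TEXT, city TEXT);
        INSERT INTO performance VALUES (46, 'Physics', 81), (47, 'Physics', 67);
        INSERT INTO details     VALUES (46, 'Ashok Ray', 'Kolkata'),
                                       (47, 'Kapil Dev', 'Delhi');
    """)

    # The common rollno field links the two tables into one combined view.
    query = """SELECT d.name, p.subject, p.marks
               FROM details d JOIN performance p ON d.rollno = p.rollno"""
    for row in conn.execute(query):
        print(row)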
Advantages of Relational Database:
1. Extremely simple concept, easy to use.
2. Generally assists in achieving data independence.
3. Quite flexible in operation, can be easily extended.
4. Security of important data items can be ensured by isolating them as an entity having separate access control rights.
5. Structured Query Language can be employed.
Term Paper # 3. Database Query and Schema:
With the increasing use of relational databases, a need has been felt to allow users with little or no knowledge of programming to use the database system, even in on-line mode, to generate the desired information. A language system has accordingly come up, generally called SQL or Structured Query Language, used in dBASE IV, Oracle, etc.
Schema:
Now, the involvement of a database system is not limited merely to the storage of data; its complexity arises from the fact that it has to define the inter-relationships between the various data items for the efficient functioning of the database, and this has both physical and logical aspects.
When the two aspects have been insulated from each other, a programmer operating at the high level has to know only the logical view of the database: what entities are available, which attributes describe them, and how they are related to each other. This logical view of the database is called the schema. When only a part of the logical view is considered, it is called a sub-schema.
A number of matters relating to the database system are described in the schema, some of which are:
1. Details of each data item or field, giving its name, type, field width, valid ranges, etc.
2. Details of the formation of records from data items, and of files from records, each representing a different class of entities.
3. The relationship model linking the different entities, whether hierarchical, network, or relational.
4. Security aspects of the database, such as password levels for different operations or user classes.
To create a database, the statements which describe it, defining the fields and so on, are written into a Data Definition File. This data definition file is input to a Database Description Processor, which generates an output describing the database in terms which the Database Management System [DBMS] can understand; it is like compiling source code written in a high-level language.
The Database Definition produced by the Database Description Processor is understandable by the DBMS, whereas its input, the Data Definition File, is understandable by us.
The process is: Data Definition File -> Database Description Processor -> Database Definition -> DBMS.
Now data can be entered, as required.
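In SQL-based systems the data definition file is simply a script of CREATE TABLE statements. A minimal sketch, again using Python's sqlite3 (the field names, types and valid range are illustrative assumptions): the executescript call plays the role of the Database Description Processor, after which data can be entered:

    import sqlite3

    # The "data definition file": name, type and constraints of each field.
    DEFINITION = """
    CREATE TABLE student (
        rollno INTEGER PRIMARY KEY,                   -- unique identifier
        name   TEXT NOT NULL,
        age    INTEGER CHECK (age BETWEEN 5 AND 99)   -- a valid range
    );
    """

    conn = sqlite3.connect(":memory:")
    conn.executescript(DEFINITION)   # "compile" the definition for the DBMS

    # Now data can be entered, as required.
    conn.execute("INSERT INTO student VALUES (46, 'Ashok Ray', 17)")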
Term Paper # 4. Database Administrator [DBA]:
To ensure the proper functioning of a database system, it is necessary to appoint a database administrator, whose job is to plan, design, create, modify and maintain the database of his organisation, with special emphasis on security and data integrity. He is considered the custodian, not the owner, of the data values stored in the database system. Ordinarily the Database Administrator is not concerned with the details of the application programs used to manipulate the data of the database system.
The DBA maintains the Schema and the Data Dictionary. Any change in the form of a data item, or the creation of a new one, can be made only by the DBA, who also generates his own reports on his areas of activity for management. The Data Dictionary contains details about the type and size of the different data items available in the database, so that a user can know which data files he can use to generate the desired information.
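Most DBMSs expose the data dictionary as queryable metadata, though the catalog names differ from system to system. A hedged sketch with Python's sqlite3, which provides the table_info pragma for exactly this purpose:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (rollno INTEGER, name TEXT)")

    # A data-dictionary style query: the name and type of each field of 'student'.
    for cid, name, ftype, notnull, default, pk in conn.execute(
            "PRAGMA table_info(student)"):
        print(name, ftype)   # rollno INTEGER / name TEXT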
Term Paper # 5. Data Processing:
Data processing in general refers to the manipulation of data, manually or by machine, to produce meaningful information. Here our concern is mainly with computerized data processing systems. The data processing cycle involves the input of data, manipulating or processing that data and, as a result, generating information, which is the desired output.
I-P-O or Input-Process-Output:
Diagrammatically, the cycle is: Input -> Process -> Output.
Term Paper # 6. Data Input:
Before being fed into the machine, input data has to be captured or recorded in a machine-usable and machine-readable form for processing by the computer. For example, computers in the 1960s mostly used 80- or 96-column paper cards, called Punch Cards, as input or source documents.
In such cases, recording the input data involved punching rectangular holes in the cards using alphanumeric codes representing characters. In modern computers, input data is mostly recorded through keyboard entry, though other input devices are also in use.
Technically, recording means the transfer of data onto some computer-readable form or document. Obviously, before data can be input, it must be collected from various internal and external sources, as the case may be. Collection means gathering the relevant and necessary data for generating the particular information from the mass of data available in abundance in any business organisation or elsewhere.
For example, in an educational institution, to process the examination results of students, the marks given by the different examiners, and nothing else, need to be collected as the basic data, which is entered into the computer using the keyboard; all other data, such as fees paid, are irrelevant.
Term Paper # 7. Data Manipulation:
The basic data processing operations carried out on input data to add meaning to it are, generally: classifying, sorting, calculating, collating, merging, searching, and summarizing.
(i) Classifying:
It is the process of organizing data into groups of similar items, that is, into small homogeneous groups based on some specific criterion. In a co-educational class, the students' data may be classified into male and female students for analysing whether there is a difference in performance on account of sex.
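A hedged Python sketch (the records and marks are invented): classification amounts to choosing a criterion and grouping the records by it, after which each group can be analysed separately:

    # Classify student records by a chosen criterion (here, the 'sex' field).
    records = [
        {"name": "Ashok", "sex": "M", "marks": 81},
        {"name": "Meena", "sex": "F", "marks": 88},
        {"name": "Kapil", "sex": "M", "marks": 67},
    ]

    groups = {}
    for r in records:
        groups.setdefault(r["sex"], []).append(r)

    # Analysing each group: average marks per class of student.
    for sex, members in groups.items():
        print(sex, sum(m["marks"] for m in members) / len(members))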
(ii) Sorting:
It is the process of arranging data in some predetermined logical order. For example, the names of the students may be arranged in alphabetical order from A to Z, or arranged on the basis of the total marks obtained by each in some examination, in descending order, starting with the highest marks.
The criterion used for sorting, like the marks in the second case, is called the key. Sorting can be done in either ascending or descending order. Another method of keeping data in sorted order, used for data files, is called indexing.
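A hedged sketch (names and marks invented): Python's built-in sorted takes exactly such a key, together with a flag for descending order:

    students = [("Ashok", 81), ("Meena", 88), ("Kapil", 67)]

    by_name = sorted(students)    # ascending, alphabetical from A to Z
    by_marks = sorted(students, key=lambda s: s[1], reverse=True)   # highest first
    print(by_marks)   # [('Meena', 88), ('Ashok', 81), ('Kapil', 67)]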
(iii) Calculating:
It is the process of carrying out arithmetic computations on numerical data, from the simplest addition to complex calculations, although the computer basically carries out addition in various forms. This is the most common processing job, carried out in the Arithmetic & Logic Unit (ALU), where logical computation involving Boolean algebra is also performed.
(iv) Collating:
It is the process of comparing different sets of data and then carrying out some operation on the basis of the result of the comparison. It is useful in the process of merging.
(v) Merging:
It is the process of creating a third set of data by combining two different sets of data which have a common field and are sorted in the same logical sequence on some criterion; the two sets are collated and then combined.
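A hedged sketch of the classic merge (the record values are invented): Python's standard heapq.merge combines two sequences already sorted on the same key, collating the records as it goes:

    from heapq import merge

    # Two sets of records, both sorted on the common field (the roll number).
    master = [(44, "old"), (46, "old"), (48, "old")]
    amendments = [(45, "new"), (46, "new")]

    combined = list(merge(master, amendments))   # collate, then combine
    print(combined)
    # [(44, 'old'), (45, 'new'), (46, 'new'), (46, 'old'), (48, 'old')]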
(vi) Searching:
It is the process of locating a particular data item in a set of data items, and is used to confirm the presence or absence of a particular value. The search operation fails if the item is not found. A number of searching algorithms are available, binary search being the most popular.
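A hedged sketch: binary search requires sorted data and repeatedly halves the portion that can still contain the item. Python ships the core step as the standard bisect module:

    from bisect import bisect_left

    def binary_search(sorted_items, target):
        """Return True if target exists in sorted_items, else False."""
        i = bisect_left(sorted_items, target)   # first position >= target
        return i < len(sorted_items) and sorted_items[i] == target

    rolls = [12, 19, 34, 46, 58]
    print(binary_search(rolls, 46))   # True
    print(binary_search(rolls, 40))   # False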
(vii) Summarizing:
It is the process of creating a few concise data items out of a mass of data. For example, the average marks computed for a particular examination are a summarization of the individual marks of the students.
Term Paper # 8. Data Output:
The activities coming under output operation are displaying, storing, retrieving, and communicating.
(i) Displaying:
It is the process of showing the outcome of a processing operation on the video screen, whereas printing does the same thing by typing it out on paper using a printer. The printed output is called Hard Copy; the displayed output is called Soft Copy.
(ii) Storing:
It is the process of keeping data in a physical storage medium like tapes or disks for future use. The data is transferred from primary to secondary storage.
(iii) Retrieving:
It is the reverse of storing. It involves getting a particular data item or set of data from the mass of data stored on a physical medium. It does not destroy the stored data.
(iv) Communicating:
It is the transfer of data from one place to another. It may involve different geographical regions, in which case networks are used to transfer the data. Displaying and printing are also part of the process of communicating.
Term Paper # 9. Data Organisation:
It is rare that data is entered into the computer system, processed to generate information, and then thrown away, as we do when using electronic calculators. In practice, there is a definite need to store data in a systematic manner for future use, and this is the area which has received the most attention from computer experts. In the process, new systems have come into existence and new terminologies have been coined, sometimes several of them referring to the same idea.
Generally, the terms in common use relating to database systems have two origins: one is IBM's Data Language I [DL/I] and the other is CODASYL's [Conference on Data Systems Languages] Data Description Language.
Although bits and bytes are the smallest units in which data is physically recorded, under IBM's DL/I a Field is the smallest named unit of data, and a Segment is a named, fixed-format quantum of data containing one or more fields, which forms the interface between the application program and DL/I. A Logical Database Record consists of a named hierarchy (tree) of related segments, and a Logical Database consists of a named collection of logical database records; it may contain one or more types of record. [The term "named" implies that a field, segment, or record has a definite name given to it.]
Diagrammatically: Field -> Segment -> Logical Database Record -> Logical Database.
Expressed in simpler language [and ignoring segments, as we are dealing with databases in general, not restricting ourselves to any specific data description language], a Field is one of the smallest units of data within each record, containing a specific piece of information relating to that record and having a distinct name. A number of fields build up a Record.
For example, in our personal address books we keep data about our friends and acquaintances. What do we note down? The name (first name, middle name, last name), house number with street/road, city/town, state, pin code, telephone number, if any, and so on. In the database terminology under discussion, the details of each person will be recorded in different fields: one field for the name, another for the house number with street, and so on.
In fact, it is our choice how we treat the name and address in terms of fields. For example, we can break the name into three fields for first name, middle name, and last name; it depends on what we are going to do with our data. In the address book, the details of each of our friends and acquaintances have to be filled into the relevant fields.
How do we accommodate them? We keep one record for each friend or acquaintance, and we keep all the records in one address book for ready access. Similarly, in a computer we store all the records together in a data file with a distinct name; we may call it the Address Data File. So a record contains all the data about a single item, such as one of our friends or acquaintances, in the database file.
All records in a data file are identical in form, with each field of each record containing different data. For example, Field 1 of Record 1 may contain Ashok, that of Record 2 may contain Kapil, and so on.
A sample database file, with a few illustrative items:

NAME    STREET           CITY      PINCODE
Ashok   12 Park Street   Kolkata   700016
Kapil   5 Mall Road      Delhi     110001
If you look at the logical view of the file [the way the user looks at the data], you will notice that it is a simple two-dimensional table, with the fields as columns and the records as rows. Such a two-dimensional table is sometimes called a flat file. A table of this type is also referred to as a relation.
A set of data items can be grouped in different ways to form different records for different purposes. A group of data items within a record is referred to as a data aggregate in some systems, or as a segment [as IBM calls it] in others. Each data item or field has to have a distinct name.
Under the CODASYL terminology, a data item is the smallest named unit of data; it is called a field in IBM's terminology, both meaning the same thing. Under this system a record is a named collection of data items, a set is a collection of records forming a two-level hierarchy, and a database consists of a named collection of records and the set relationships between them. [CODASYL used a network approach for storing data, whereas IBM's Information Management System used a hierarchical approach.]
Whatever we call the smallest unit of data, a field or a data item, the basic objective is to provide systematic storage of data which can be easily stored, quickly retrieved, and readily processed to generate the desired information. The first step towards this objective is to classify data objects of similar types, called entities.
For example, the students of a class can be termed entities, because they study together in a particular class in a particular college. Similarly, the teachers of the institution could be called another type of entity.
Entities are described by attributes. For example, with students as entities, each student can be identified by certain attributes which describe him or her: the name, roll number, address, age, etc. The distinct set of attributes defines the entity called student.
Entities can be grouped together to form a common unit for storage, say a Student File, in which each entity has a separate but identical data structure called a record. A record, as we have already seen, has a number of components called fields, one for each attribute; a field is also called a data item. Thus there could be a field for the roll number, a field for the name, and so on.
The fields contain data elements which identify each entity: the field named ROLLNO denotes a data item, in which the roll number 46 as a data element would identify, say, Ashok Ray. [In some literature, data item and data element are used synonymously.] To summarize, broadly speaking: fields constitute a record, records constitute a data file, and data files constitute a database.
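That summary maps directly onto ordinary program structures. A hedged Python sketch (the entity, its attributes and the values are invented):

    from dataclasses import dataclass

    @dataclass
    class Student:            # the entity, defined by its attributes
        rollno: int           # a field, also called a data item
        name: str
        age: int

    # A record is one filled-in set of fields; a data file collects the records.
    record = Student(rollno=46, name="Ashok Ray", age=17)
    student_file = [record, Student(47, "Kapil Dev", 18)]
    database = {"students": student_file}   # data files constitute a database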
Term Paper # 10. Data Files:
Database files can be broadly classified into two categories, depending on the permanency of the data stored in relation to time: the Master File and the Transaction File. The Master File is a file of an almost permanent nature which contains all the data required for a given application.
Transaction Files, on the other hand, are created periodically to hold data relating to current transactions, such as sales and purchases during, say, January 1993, while the Master File for sales contains the same details on a year-to-date basis. Hence the Master Files need to be periodically updated with the data from the relevant transaction files.
The process of transferring the current data of the relevant records to build up cumulative totals in the respective fields of the master file, or of adding or deleting records in the master file based on the current data of the relevant transaction file, is called Updating.
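A hedged sketch of updating (the field names and figures are invented): the transaction file's amounts are accumulated into the master file's year-to-date fields, and new records are added for customers not yet on the master:

    # Master file: year-to-date sales per customer.
    master = {"C001": 1200, "C002": 450}
    # Transaction file: (customer, amount) pairs for the current period.
    transactions = [("C001", 100), ("C003", 75)]

    for customer, amount in transactions:
        master[customer] = master.get(customer, 0) + amount   # update or add

    print(master)   # {'C001': 1300, 'C002': 450, 'C003': 75}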
As far as storage media are concerned, magnetic tapes have a basic limitation during updating: even a single unit of data, called a block, cannot be accurately overwritten [overlaid] with new or modified data, which is possible on direct-access storage devices like magnetic disks, where sectors/clusters can be overwritten.
Hence updating on tape is carried out by creating a new sorted tape file containing the records to be amended, changed, deleted, or added; the old master file and the new file of amendments are then run concurrently and merged to create a new master file containing the updated records. This new master file is naturally used during the next updating run, creating yet another new master file.
Generally, as a precaution against accidental loss of data, the original master file is not erased (destroyed) until the second new master file has been created. Hence, at any instant, we have three generations of master files for any application: father, son, and grandson. Operations of this type are done with batch processing.
In many file operations, even with direct-access storage devices, a copy of the file being handled is automatically retained along with the edited version. This process is called auto-backup, or sometimes transaction logging, and it is done as a precaution against damage to the data file caused inadvertently by software or hardware failure or by mistakes; the term crash is generally used to denote the destruction of files.
For example, when the line editor EDLIN of MS DOS, or a screen editor like WordStar or Sidekick, is used to open an existing file, the original file is retained with the extension .bak and the new version is saved under the original filename and extension, if any, the filename remaining the same in both cases. In case of a file crash, the father file with the .bak extension can be used to create a new son. It is always good practice to deliberately create backups of files as a safeguard against a crash.
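A hedged sketch of the same .bak convention in Python (the filename is invented; shutil and os are standard-library modules):

    import os
    import shutil

    def save_with_backup(path, new_text):
        """Keep the father as a .bak file, then save the new son."""
        if os.path.exists(path):
            shutil.copy2(path, path + ".bak")   # e.g. ledger.txt -> ledger.txt.bak
        with open(path, "w") as f:
            f.write(new_text)

    save_with_backup("ledger.txt", "updated contents")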
In fact, MS DOS provides two utility programs, BACKUP and RESTORE, specifically for this purpose. Backups can also be created on a different storage medium, for example disk files being backed up to tape cartridges, a practice generally called check-pointing. All these measures have been developed to maintain the integrity of the stored data, be it a love letter or a financial ledger. These days, a number of utility programs are available for backing up data and programs.
Term Paper # 11. Processing Mode:
Having come a long way from the days of mechanized accounting to EDP [Electronic Data Processing], a modern computer can carry out processing in many ways, classified according to the time of processing in relation to the input of data: whether the output is available within seconds of inputting the data, or only after days or even weeks.
The two main processing techniques are:
a. Batch Processing and
b. Interactive Processing.
(i) Batch or Sequential Processing:
The input data are collected and kept in batches or groups according to the output to be generated. Then, at some predetermined time, all the input data of one batch are processed together in one go. For example, after all the input data relating to payroll accounting have been processed, the stores consumption data may be processed as another batch. Normally this type of batch processing is carried out at centralized computer centres.
(ii) Interactive Processing or On-Line Processing:
Here the input data is processed the moment it is entered into the computer system, producing the necessary output, as in the Computerized Railway Reservation System. Obviously, it would lead to a chaotic situation if all requests for railway reservations were processed on a weekly basis in batch mode; the output has to be known quickly. It is called interactive because the user is in direct communication with the computer. The term on-line indicates equipment in direct contact with the active computer system, which the respective terminals are.
(iii) Real Time Processing:
It is a special case of interactive [on-line] processing in which the emphasis, or the critical factor, is the response time: the time required to process the input and generate the output. It is generally adopted where the computer controls other machines and a quick response is a must. For example, when computer-controlled guns fire at attacking enemy planes, the calculation of the firing angle has to be done quickly, so that the shell hits the aircraft.
(iv) OLTP – On Line Transaction Processing:
This is also a case of interactive processing in which, once the input request for a transaction, say a money transfer from one place to another, is received, it is completely processed before another input is taken up. It is generally used in networks providing almost instantaneous service.
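A hedged sketch of the all-or-nothing character of such a transaction, once more with Python's sqlite3 (the account names and balances are invented): the transfer either completes in full before the next request is taken up, or is rolled back entirely:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("savings", 500), ("current", 100)])

    # One complete transaction: commits on success, rolls back on any error.
    with conn:
        conn.execute("UPDATE account SET balance = balance - 200"
                     " WHERE name = 'savings'")
        conn.execute("UPDATE account SET balance = balance + 200"
                     " WHERE name = 'current'")

    print(dict(conn.execute("SELECT name, balance FROM account")))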
(v) In-Line or Random Processing:
Here, selected jobs are processed as per some priority scheme. Once the processing of a specific job starts, it is processed to completion, generating the final output.