Managing Query Execution in a Database Engine, Part 1
Introduction
In this article we discuss how a Database Management System (DBMS) handles queries. In many DBMSs, queries are posed in a non-procedural language such as Structured Query Language (SQL) and, as mentioned in earlier articles, such queries do not specify access paths or the order in which operations are evaluated. Query processing for this kind of query in a DBMS typically involves the following four stages:
1. Parsing
2. Optimization
3. Code generation
4. Execution
The parsing stage checks the query for correct syntax and translates it into a standard parse tree (often called a query tree) or some other internal representation.
If the parser reports no errors and the query uses user-defined views, the query must be expanded by substituting the view definitions. The query must then be checked for semantic correctness by consulting the system catalog. Type compatibility is also checked, both in expressions and in predicate comparisons.
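The difference between a syntax error and a semantic error can be seen in a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module purely as a convenient stand-in; the article does not name a specific engine, and the table, view, and helper names are assumptions made for illustration. Prefixing a statement with EXPLAIN makes the engine compile it without returning any data rows.

```python
import sqlite3

# Illustrative schema (assumed, not from the article).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute(
    "CREATE VIEW sales_staff AS SELECT id, name FROM employee WHERE dept = 'sales'"
)

def check(sql):
    """Compile the statement only; return 'ok' or the engine's error message."""
    try:
        conn.execute("EXPLAIN " + sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)

# Misspelled keyword: rejected while parsing.
syntax_result = check("SELEC name FROM employee")

# Well-formed SQL, but the column is not in the catalog: a semantic error.
semantic_result = check("SELECT salary FROM employee")

# A query over a view compiles once the view definition has been substituted.
view_result = check("SELECT name FROM sales_staff")

print(syntax_result)
print(semantic_result)
print(view_result)
```

The first message points at the offending token, while the second names a column that the catalog lookup could not find. Other engines word these messages differently.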
The optimizer is then called with the internal representation of the query as its input, so that a query plan (or execution plan) can be devised for retrieving the required data. The optimizer performs a number of tasks. It binds the symbolic names in the query to database objects, checks that those objects exist, and checks whether the user is authorized to perform the operations the query specifies.
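The name resolution and authorization checks can be pictured with a toy example. Everything in this sketch is hypothetical: the CATALOG and PRIVILEGES dictionaries, the user names, and the bind_and_authorize function are invented for illustration and do not correspond to any particular DBMS.

```python
# Hypothetical catalog: table name -> set of column names.
CATALOG = {"employee": {"id", "name", "dept"}}

# Hypothetical grants: (user, table) -> set of permitted actions.
PRIVILEGES = {("alice", "employee"): {"SELECT"}}

def bind_and_authorize(user, action, table, columns):
    """Resolve names against the catalog, then check the user's privileges."""
    if table not in CATALOG:
        raise LookupError(f"unknown table: {table}")
    missing = sorted(set(columns) - CATALOG[table])
    if missing:
        raise LookupError(f"unknown column(s) in {table}: {missing}")
    if action not in PRIVILEGES.get((user, table), set()):
        raise PermissionError(f"{user} may not {action} on {table}")
    # Return the bound references as (table, column) pairs.
    return [(table, column) for column in columns]

bound = bind_and_authorize("alice", "SELECT", "employee", ["name", "dept"])
print(bound)
```

A query that names a missing object or an action the user was not granted is rejected here, before any plan is built.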
In formulating the plans, the query optimizer obtains the relevant information from the metadata (data about data) that the system maintains, tries to model the estimated cost of executing several alternative query plans, and then chooses the best among them. The metadata, also called the system catalog, consists of descriptions of all the databases that a DBMS maintains. Typically, the query optimizer will retrieve at least the following information:
1. The cardinality of each relation (table) of interest.
2. The number of data pages in each relation (table) of interest.
3. The number of distinct keys in each index of interest.
4. The number of pages in each index of interest.
This information, and possibly more, is used by the optimizer in modeling the cost estimate for each alternative query plan.
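To make the role of these four statistics concrete, here is a deliberately simplified cost model. It is a sketch under stated assumptions, not the formula any real optimizer uses: cost is counted in page reads, key values are assumed to be uniformly distributed, the index is assumed to be unclustered (one page read per matching row), and index_height is an assumed constant. All class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TableStats:
    rows: int        # cardinality of the table
    data_pages: int  # number of data pages in the table

@dataclass
class IndexStats:
    distinct_keys: int  # number of distinct keys in the index
    index_pages: int    # number of pages in the index

def full_scan_cost(table):
    # A full scan reads every data page once.
    return float(table.data_pages)

def index_lookup_cost(table, index, index_height=2):
    # Assumption: uniform distribution, so each key matches rows / distinct_keys rows.
    matching_rows = table.rows / index.distinct_keys
    # Assumption: index pages are spread evenly across the keys.
    leaf_pages = max(1.0, index.index_pages / index.distinct_keys)
    # Assumption: unclustered index, one data page read per matching row.
    return index_height + leaf_pages + matching_rows

def choose_plan(table, index):
    """Estimate both alternatives and return the cheaper one with all costs."""
    costs = {
        "full_scan": full_scan_cost(table),
        "index_lookup": index_lookup_cost(table, index),
    }
    return min(costs, key=costs.get), costs

orders = TableStats(rows=100_000, data_pages=2_000)
selective = IndexStats(distinct_keys=50_000, index_pages=300)
unselective = IndexStats(distinct_keys=10, index_pages=300)

best_selective, _ = choose_plan(orders, selective)
best_unselective, _ = choose_plan(orders, unselective)
print(best_selective, best_unselective)
```

With many distinct keys the index lookup touches only a few pages and wins. With few distinct keys each lookup matches a large share of the table, and the full scan becomes the cheaper estimate.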
Important additional information is usually present in the system catalog:
1. The name of each table (relation) and of all its columns (attributes), together with their domains.
2. Information about the Primary Key (PK) and Foreign Keys (FK) of each table or relation.
3. Definitions of views.
4. Descriptions of storage structures.
5. Other information, including data about ownership and security.
Often this information is updated periodically rather than at every update, insert, or delete. The system catalog is usually stored as a relational database itself, which makes it easy to query the catalog when a user is authorized to do so.
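As an illustration of a catalog that can be queried like any other table, the sketch below uses SQLite, chosen only because it ships with Python; the article does not single out an engine. In SQLite the catalog table is named sqlite_master, and the table_info and foreign_key_list pragmas expose column and foreign key details. The schema itself is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(id)
    );
    CREATE VIEW employee_names AS SELECT name FROM employee;
""")

# The catalog is itself a table, so an ordinary SELECT lists tables and views.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

# foreign_key_list rows are (id, seq, table, from, to, ...).
foreign_keys = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(employee)")
]

print(objects)
print(columns)
print(foreign_keys)
```

Other systems expose similar information under different names, so the queries here should be read as SQLite-specific.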
The data in the catalog is clearly very important, because query processing uses it extensively. The more comprehensive and accurate the information a database maintains, the better the optimization it can perform. On the other hand, maintaining more comprehensive and more accurate information also adds overhead, so a good balance must be found.
The optimizer also uses the catalog data during access path selection. Because these statistics are usually refreshed only periodically, they are not always accurate.
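Stale statistics are easy to observe in practice. The following sketch again uses SQLite as an example engine, which is an assumption of this illustration: there, the ANALYZE command stores index statistics in the sqlite_stat1 table, and they stay unchanged until ANALYZE is run again. The table and index names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

def add_orders(count):
    # Spread the new rows evenly over ten customers.
    conn.executemany(
        "INSERT INTO orders (customer_id) VALUES (?)",
        [(i % 10,) for i in range(count)],
    )
    conn.commit()

def index_stat():
    # The stat text starts with the row count recorded at the last ANALYZE.
    row = conn.execute(
        "SELECT stat FROM sqlite_stat1 WHERE idx = 'idx_orders_customer'"
    ).fetchone()
    return row[0]

add_orders(100)
conn.execute("ANALYZE")
before = index_stat()

add_orders(900)          # the table grows tenfold...
stale = index_stat()     # ...but the stored statistics do not change

conn.execute("ANALYZE")  # refreshing brings them up to date
refreshed = index_stat()

print(before, "|", stale, "|", refreshed)
```

Between the two ANALYZE calls the optimizer would be planning with figures that describe a table one tenth of its real size, which is the accuracy trade-off described above.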
An important part of the optimizer is the component that consults the metadata stored in the database to obtain statistics about the referenced relations (tables) and the access paths available on them. These are used to determine the most efficient order of the relational operations and the most efficient access paths. The order of operations and the access paths are selected from the many alternatives that usually exist, so that the cost of query processing is minimized. Further details of query optimization are explained in the next section.
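Many engines let you inspect the access path the optimizer picked. As a hedged illustration, the sketch below uses SQLite's EXPLAIN QUERY PLAN statement; the engine, schema, and helper name are assumptions of this example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

def plan(sql):
    """Return the human-readable detail text of each plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

query = "SELECT total FROM orders WHERE customer_id = 42"

# With no index on customer_id, the only access path is a table scan.
plan_without_index = plan(query)

# Once an index exists, the optimizer has an alternative access path.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The same query text yields a different plan once the catalog records the new index, without any change to the query itself.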
When the optimizer finds no errors and produces an execution plan, the code generator is invoked. The code generator uses the execution plan to produce machine language code and any associated data structures. This code can be stored if it is expected to be executed more than once. To execute the code, the system transfers control to it, and the code then runs.
In the next part we will discuss the query management process in detail.