查询处理
Query Processing | 查询处理
查询处理流程
In PostgreSQL, although the parallel query implemented in version 9.6 uses multiple background worker processes, a backend process basically handles all queries issued by the connected client. This backend consists of five subsystems, as shown below:
PostgreSQL 中使用了多个 Background Worker 来并发地处理查询请求
-
Parser,Parser 会从普通的 SQL 文本中生成解析树。
-
Analyzer/Analyser,对提取树进行语义化分析,生成查询树。
-
Rewriter, The rewriter transforms a query tree using the rules stored in the rule system if such rules exist.
Planner The planner generates the plan tree that can most effectively be executed from the query tree.
Executor The executor executes the query via accessing the tables and indexes in the order that was created by the plan tree.
Parser
The parser generates a parse tree that can be read by subsequent subsystems from an SQL statement in plain text. A parse tree is a tree whose root node is the SelectStmt structure defined in parsenodes.h.
Due to the fact that the parser only checks the syntax of an input when generating a parse tree, it only returns an error if there is a syntax error in the query. The parser does not check the semantics of an input query. For example, even if the query contains a table name that does not exist, the parser does not return an error. Semantic checks are done by the analyzer/analyser.
Analyzer/Analyser
The analyzer/analyser runs a semantic analysis of a parse tree generated by the parser and generates a query tree.
The root of a query tree is the Query structure defined in parsenodes.h; this structure contains metadata of its corresponding query such as the type of this command (SELECT, INSERT or others) and several leaves; each leaf forms a list or a tree and holds data of the individual particular clause.
Rewriter
The rewriter is the system that realizes the rule system, and transforms a query tree according to the rules stored in the pg_rules system catalog if necessary.
Views in PostgreSQL are implemented by using the rule system. When a view is defined by the CREATE VIEW command, the corresponding rule is automatically generated and stored in the catalog.
sampledb=# CREATE VIEW employees_list
sampledb-# AS SELECT e.id, e.name, d.name AS department
sampledb-# FROM employees AS e, departments AS d WHERE e.department_id = d.id;
When a query that contains a view shown below is issued, the parser creates the parse tree as shown in Fig. 3.4(a).
sampledb=# SELECT * FROM employees_list;
Planner and Executor
The planner receives a query tree from the rewriter and generates a (query) plan tree that can be processed by the executor most effectively.
The planner in PostgreSQL is based on pure cost-based optimization; it does not support rule-based optimization and hints. This planner is the most complex subsystem in RDBMS;
A plan tree is composed of elements called plan nodes, and it is connected to the plantree list of the PlannedStmt structure.
Each plan node has information that the executor requires for processing, and the executor processes from the end of the plan tree to the root in the case of a single-table query.
For example, the plan tree shown in Fig. 3.5 is a list of a sort node and a sequential scan node; thus, the executor scans the table:tbl_a by a sequential scan and then sorts the obtained result.
The executor reads and writes tables and indexes in the database cluster via the buffer manager described in Chapter 8. When processing a query, the executor uses some memory areas, such as temp_buffers and work_mem, allocated in advance and creates temporary files if necessary.
In addition, when accessing tuples, PostgreSQL uses the concurrency control mechanism to maintain consistency and isolation of the running transactions.