docs: update README (#280)

* docs: update readme.md

* docs: update readme

* docs: update docs
Hayden 2024-03-28 19:39:06 +08:00 committed by GitHub
parent c6615aecac
commit ddf6c8fd8f
2 changed files with 258 additions and 241 deletions


@ -13,39 +13,27 @@
[English](./README.md) | 简体中文 [English](./README.md) | 简体中文
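dt-sql-parser is a **SQL Parser** project built with [ANTLR4](https://github.com/antlr/antlr4) for the big data field. With the Parser, Visitor, and Listener generated by [ANTLR4](https://github.com/antlr/antlr4), we can easily perform **syntax validation**, **tokenization** (Tokenizer), and **AST traversal** on SQL statements. It also provides some auxiliary methods, such as **SQL splitting** and **code completion**. dt-sql-parser is a **SQL Parser** project built with [ANTLR4](https://github.com/antlr/antlr4) for the big data field. With the Parser, Visitor, and Listener generated by [ANTLR4](https://github.com/antlr/antlr4), we can easily perform **lexical analysis** (Lexer), **syntax analysis** (Parser), and **AST traversal** on SQL statements.
In addition, it provides some advanced features, such as **SQL validation**, **code completion**, and **collecting table and column names**.
**Supported SQL types:** **Supported SQL types:**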
- MySQL - MySQL
- Flink SQL - Flink
- Spark SQL - Spark
- Hive SQL - Hive
- PostgreSQL - PostgreSQL
- Trino SQL - Trino
- Impala SQL - Impala
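**Supported auxiliary methods** > Tip: All SQL Parsers are currently written in `TypeScript`. If needed, you can try compiling the Grammar files to other target languages.
| SQL Type | SQL Splitting | Code Completion |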
| ----------- | -------- | -------- |
| MySQL | ✅ | ✅ |
| Flink SQL | ✅ | ✅ |
| Spark SQL | ✅ | ✅ |
| Hive SQL | ✅ | ✅ |
| PostgreSQL | ✅ | ✅ |
| Trino SQL | ✅ | ✅ |
| Impala SQL | ✅ | ✅ |
> Tip: The current Parser is written in `JavaScript`. If necessary, you can try compiling the Grammar files to other target languages.
<br/> <br/>
## Integrating with MonacoEditor ## Integrating with MonacoEditor
We provide [monaco-sql-languages](https://github.com/DTStack/monaco-sql-languages), which makes it easy to integrate `dt-sql-parser` with `monaco-editor`. We provide [monaco-sql-languages](https://github.com/DTStack/monaco-sql-languages), which makes it easy to integrate `dt-sql-parser` with `monaco-editor`.
>Tip: To run `dt-sql-parser` in the browser, remember to install the `assert` and `util` polyfill packages and to define the global variable `process.env`. None of this is needed in a node environment, because node has them built in.
<br/> <br/>
## Installation ## Installation
@ -61,48 +49,33 @@ yarn add dt-sql-parser
<br/> <br/>
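## Usage ## Usage
Before getting started, you need to understand the basic usage. `dt-sql-parser` provides a corresponding SQL Parser class for each type of SQL: Before getting started, you need to understand the basic usage. `dt-sql-parser` provides a corresponding SQL class for each type of SQL: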
```javascript ```typescript
import { MySQL, FlinkSQL, SparkSQL, HiveSQL, PostgresSQL, TrinoSQL, ImpalaSQL } from 'dt-sql-parser'; import { MySQL, FlinkSQL, SparkSQL, HiveSQL, PostgreSQL, TrinoSQL, ImpalaSQL } from 'dt-sql-parser';
``` ```
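Before using features such as syntax validation and code completion, you need to instantiate the Parser for the corresponding SQL type. Taking `MySQL` as an example: Before using features such as syntax validation and code completion, you need to instantiate the corresponding SQL class. Taking `MySQL` as an example: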
```javascript ```typescript
const parser = new MySQL(); const mysql = new MySQL();
``` ```
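The examples below use `MySQL`; Parsers for other SQL types are used the same way as `MySQL`. The examples below use `MySQL`; Parsers for other SQL types are used the same way as `MySQL`.
### Syntax Validation ### Syntax Validation
```javascript First instantiate the SQL class, then call the `validate` method on the SQL instance to validate the SQL statement. If validation fails, it returns an array containing `error` information.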
```typescript
import { MySQL } from 'dt-sql-parser'; import { MySQL } from 'dt-sql-parser';
const parser = new MySQL(); const mysql = new MySQL();
const incorrectSql = 'selec id,name from user1;';
const errors = mysql.validate(incorrectSql);
const correctSql = 'select id,name from user1;';
const errors = parser.validate(correctSql);
console.log(errors); console.log(errors);
``` ```
*Output:* *Output:*
```javascript ```typescript
/*
[]
*/
```
**Validation failure example:**
```javascript
const incorrectSql = 'selec id,name from user1;'
const errors = parser.validate(incorrectSql);
console.log(errors);
```
*Output:*
```javascript
/* /*
[ [
{ {
@ -110,30 +83,30 @@ console.log(errors);
endLine: 1, endLine: 1,
startCol: 0, startCol: 0,
startLine: 1, startLine: 1,
message: "mismatched input 'SELEC' expecting {<EOF>, 'ALTER', 'ANALYZE', 'CALL', 'CHANGE', 'CHECK', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DROP', 'EXPLAIN', 'GET', 'GRANT', 'INSERT', 'KILL', 'LOAD', 'LOCK', 'OPTIMIZE', 'PURGE', 'RELEASE', 'RENAME', 'REPLACE', 'RESIGNAL', 'REVOKE', 'SELECT', 'SET', 'SHOW', 'SIGNAL', 'UNLOCK', 'UPDATE', 'USE', 'BEGIN', 'BINLOG', 'CACHE', 'CHECKSUM', 'COMMIT', 'DEALLOCATE', 'DO', 'FLUSH', 'HANDLER', 'HELP', 'INSTALL', 'PREPARE', 'REPAIR', 'RESET', 'ROLLBACK', 'SAVEPOINT', 'START', 'STOP', 'TRUNCATE', 'UNINSTALL', 'XA', 'EXECUTE', 'SHUTDOWN', '--', '(', ';'}" message: "..."
} }
] ]
*/ */
``` ```
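First instantiate a Parser object, then use the `validate` method to validate the SQL statement. If validation fails, it returns an array containing `error` information.
### Lexical Analysis (Tokenizer) ### Lexical Analysis (Tokenizer)
In some scenarios, you can use `getAllTokens` to run lexical analysis on a SQL statement on its own and get all the Token objects: By calling the `getAllTokens` method on the SQL instance, you can run lexical analysis on a SQL statement and get all the Token objects: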
```javascript ```typescript
import { MySQL } from 'dt-sql-parser'; import { MySQL } from 'dt-sql-parser';
const parser = new MySQL() const mysql = new MySQL();
const sql = 'select id,name,sex from user1;' const sql = 'select id,name,sex from user1;'
const tokens = parser.getAllTokens(sql) const tokens = mysql.getAllTokens(sql);
console.log(tokens)
console.log(tokens);
``` ```
*Output:* *Output:*
```javascript ```typescript
/* /*
[ [
{ {
@ -154,52 +127,40 @@ console.log(tokens)
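### Visitor Pattern (Visitor) ### Visitor Pattern (Visitor)
Use the Visitor pattern to access specified nodes in the AST Use the Visitor pattern to access specified nodes in the AST and compute a result: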
```typescript ```typescript
import { MySQL, AbstractParseTreeVisitor } from 'dt-sql-parser'; import { MySQL, MySqlParserVisitor } from 'dt-sql-parser';
import type { MySqlParserVisitor } from 'dt-sql-parser';
const parser = new MySQL(); const mysql = new MySQL();
const sql = `select id,name from user1;`; const sql = `select id, name from user1;`;
const tree = parser.parse(sql); const parseTree = mysql.parse(sql);
type Result = string; class MyVisitor extends MySqlParserVisitor<string> {
defaultResult(): string {
class MyVisitor extends AbstractParseTreeVisitor<Result> implements MySqlParserVisitor<Result> {
protected defaultResult() {
return ''; return '';
} }
visitTableName(ctx) { aggregateResult(aggregate: string, nextResult: string): string {
let tableName = ctx.text.toLowerCase(); return aggregate + nextResult;
console.log('TableName', tableName);
return '';
}
visitSelectElements(ctx) {
let selectElements = ctx.text.toLowerCase();
console.log('SelectElements', selectElements);
return '';
}
visitProgram(ctx) { // program is the root rule
this.visitChildren(ctx);
return 'Return by program context'
} }
visitProgram = (ctx) => {
return this.visitChildren(ctx);
};
visitTableName = (ctx) => {
return ctx.getText();
};
} }
const visitor = new MyVisitor(); const visitor = new MyVisitor();
const result = visitor.visit(tree); const result = visitor.visit(parseTree);
console.log(result); console.log(result);
``` ```
*Output:* *Output:*
```javascript ```typescript
/* /*
SelectElements id,name user1
TableName user1
*/
/*
Return by program node
*/ */
``` ```
@ -210,51 +171,48 @@ Return by program node
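The Listener pattern uses the `ParseTreeWalker` object provided by [ANTLR4](https://github.com/antlr/antlr4) to traverse the AST, calling the corresponding method on entering each node. The Listener pattern uses the `ParseTreeWalker` object provided by [ANTLR4](https://github.com/antlr/antlr4) to traverse the AST, calling the corresponding method on entering each node.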
```typescript ```typescript
import { MySQL } from 'dt-sql-parser'; import { MySQL, MySqlParserListener } from 'dt-sql-parser';
import type { MySqlParserListener } from 'dt-sql-parser';
const parser = new MySQL(); const mysql = new MySQL();
const sql = 'select id,name from user1;'; const sql = 'select id, name from user1;';
const parseTree = parser.parse(sql); const parseTree = mysql.parse(sql);
class MyListener implements MySqlParserListener { class MyListener extends MySqlParserListener {
enterTableName(ctx) { result = '';
let tableName = ctx.text.toLowerCase(); enterTableName = (ctx): void => {
console.log('TableName:', tableName); this.result = ctx.getText();
} };
enterSelectElements(ctx) {
let selectElements = ctx.text.toLowerCase();
console.log('SelectElements:', selectElements);
}
} }
const listenTableName = new MyListener();
parser.listen(listenTableName as MySqlParserListener, parseTree); const listener = new MyListener();
mysql.listen(listener, parseTree);
console.log(listener.result)
``` ```
*Output:* *Output:*
```javascript ```typescript
/* /*
SelectElements id,name user1
TableName user1
*/ */
``` ```
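> Tip: When using the Listener pattern, the method names for each node can be found in the Listener file under the corresponding SQL directory
### Splitting SQL by Statement ### Splitting SQL by Statement
Take `FlinkSQL` as an example: Call the `splitSQLByStatement` method on the SQL instance, taking `FlinkSQL` as an example: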
```javascript ```typescript
import { FlinkSQL } from 'dt-sql-parser'; import { FlinkSQL } from 'dt-sql-parser';
const parser = new FlinkSQL();
const flink = new FlinkSQL();
const sql = 'SHOW TABLES;\nSELECT * FROM tb;'; const sql = 'SHOW TABLES;\nSELECT * FROM tb;';
const sqlSlices = parser.splitSQLByStatement(sql); const sqlSlices = flink.splitSQLByStatement(sql);
console.log(sqlSlices) console.log(sqlSlices)
``` ```
*Output:* *Output:*
```javascript ```typescript
/* /*
[ [
{ {
@ -280,40 +238,44 @@ console.log(sqlSlices)
``` ```
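### Code Completion ### Code Completion
Get code completion information at a specified position in the SQL text, taking `FlinkSQL` as an example: Get code completion information at a specified position in the SQL text. Taking `FlinkSQL` as an example, call the `getSuggestionAtCaretPosition` method on the SQL instance, passing the SQL text and the line and column numbers of the position:
> Some additional notes on the [code completion position](#自动补全功能的-caretposition) appear later in this document.
Call the `getSuggestionAtCaretPosition` method, passing the SQL content and the line and column numbers of the position. Some additional notes on the [code completion position](#自动补全功能的-caretposition) appear later in this document.
+ **Get the list of keyword candidates** + **Get the list of keyword candidates**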
```javascript ```typescript
import { FlinkSQL } from 'dt-sql-parser'; import { FlinkSQL } from 'dt-sql-parser';
const parser = new FlinkSQL();
const flink = new FlinkSQL();
const sql = 'CREATE '; const sql = 'CREATE ';
const pos = { lineNumber: 1, column: 16 }; // the end position const pos = { lineNumber: 1, column: 16 }; // the end position
const keywords = parser.getSuggestionAtCaretPosition(sql, pos)?.keywords; const keywords = flink.getSuggestionAtCaretPosition(sql, pos)?.keywords;
console.log(keywords); console.log(keywords);
``` ```
*Output:* *Output:*
```javascript ```typescript
/* /*
[ 'CATALOG', 'FUNCTION', 'TEMPORARY', 'VIEW', 'DATABASE', 'TABLE' ] [ 'CATALOG', 'FUNCTION', 'TEMPORARY', 'VIEW', 'DATABASE', 'TABLE' ]
*/ */
``` ```
+ **Get syntax-related code completion information** + **Get syntax-related code completion information**
```javascript ```typescript
const parser = new FlinkSQL(); import { FlinkSQL } from 'dt-sql-parser';
const flink = new FlinkSQL();
const sql = 'SELECT * FROM tb'; const sql = 'SELECT * FROM tb';
const pos = { lineNumber: 1, column: 16 }; // after 'tb' const pos = { lineNumber: 1, column: 16 }; // after 'tb'
const syntaxSuggestions = parser.getSuggestionAtCaretPosition(sql, pos)?.syntax; const syntaxSuggestions = flink.getSuggestionAtCaretPosition(sql, pos)?.syntax;
console.log(syntaxSuggestions); console.log(syntaxSuggestions);
``` ```
*Output:* *Output:*
```javascript ```typescript
/* /*
[ [
{ {
@ -347,6 +309,53 @@ console.log(sqlSlices)
``` ```
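Syntax-related code completion information is returned as an array, where each item indicates what syntax can be filled in at that position. For example, the output above means the position can be filled with a **table name** or a **view name**. `syntaxContextType` is the type of syntax that can be completed, and `wordRanges` is the content already typed. Syntax-related code completion information is returned as an array, where each item indicates what syntax can be filled in at that position. For example, the output above means the position can be filled with a **table name** or a **view name**. `syntaxContextType` is the type of syntax that can be completed, and `wordRanges` is the content already typed.
### Get the Entities in SQL (Table Names, Column Names, etc.)
Call the `getAllEntities` method on the SQL instance, passing the SQL text and the line and column numbers of the position.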
```typescript
import { FlinkSQL } from 'dt-sql-parser';
const flink = new FlinkSQL();
const sql = 'SELECT * FROM tb;';
const pos = { lineNumber: 1, column: 16 }; // after 'tb'
const entities = flink.getAllEntities(sql, pos);
console.log(entities);
```
*Output:*
```typescript
/*
[
{
entityContextType: 'table',
text: 'tb',
position: {
line: 1,
startIndex: 14,
endIndex: 15,
startColumn: 15,
endColumn: 17
},
belongStmt: {
stmtContextType: 'selectStmt',
position: [Object],
rootStmt: [Object],
parentStmt: [Object],
isContainCaret: true
},
relatedEntities: null,
columns: null,
isAlias: false,
origin: null,
alias: null
}
]
*/
```
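The line and column numbers are optional. If they are passed, any collected entity that lies within the statement at that position will have its owning statement object marked with `isContainCaret`. Combined with code completion, this helps you quickly filter out the entities you need.
### Other APIs ### Other APIs
- `createLexer`: creates and returns an Antlr4 Lexer instance; - `createLexer`: creates and returns an Antlr4 Lexer instance;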
@ -365,7 +374,7 @@ console.log(sqlSlices)
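For an index range, the start index begins at 0 and ends at n-1. In the figure above, the index range covering the blue text is expressed as: For an index range, the start index begins at 0 and ends at n-1. In the figure above, the index range covering the blue text is expressed as: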
```javascript ```typescript
{ {
startIndex: 0, startIndex: 0,
endIndex: 3 endIndex: 3
@ -378,7 +387,7 @@ console.log(sqlSlices)
![line-image](./docs/images/line.png) ![line-image](./docs/images/line.png)
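For a range spanning multiple lines, line numbers start at 1 and end at n. A range covering the first and second lines is expressed as: For a range spanning multiple lines, line numbers start at 1 and end at n. A range covering the first and second lines is expressed as: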
```javascript ```typescript
{ {
startLine: 1, startLine: 1,
endLine: 2 endLine: 2
@ -392,7 +401,7 @@ console.log(sqlSlices)
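Column numbers are easier to understand by analogy with the editor's cursor position. For a range spanning multiple columns, columns start at 1 and end at n+1. In the figure above, the column range covering the blue text is expressed as: Column numbers are easier to understand by analogy with the editor's cursor position. For a range spanning multiple columns, columns start at 1 and end at n+1. In the figure above, the column range covering the blue text is expressed as: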
```javascript ```typescript
{ {
startColumn: 1, startColumn: 1,
endColumn: 5 endColumn: 5
@ -404,14 +413,14 @@ dt-sql-parser 的自动补全功能在设计之初就是为了在编辑器中使
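In some other scenarios, however, you may need to convert or compute the position information that code completion requires, so there are a few caveats you may want to know about first. In some other scenarios, however, you may need to convert or compute the position information that code completion requires, so there are a few caveats you may want to know about first.
dt-sql-parser's code completion relies on [antlr4-c3](https://github.com/mike-lischke/antlr4-c3), which is a great library. dt-sql-parser only wraps and converts on top of antlr4-c3, including converting line and column information into the token index that antlr4-c3 needs. Take the figure below as an example: dt-sql-parser's code completion relies on [antlr4-c3](https://github.com/mike-lischke/antlr4-c3), which is a great library. dt-sql-parser only wraps and converts on top of antlr4-c3, including converting line and column information into the token index that antlr4-c3 needs. Take the figure below as an example: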
![column-image](./docs/images/token.png) ![column-image](./docs/images/token.png)
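Treat the column in the figure as the cursor position. Placed in an editor, this text yields 13 possible cursor positions, while for dt-sql-parser the text is parsed into 4 Tokens. An important strategy of code completion is: **when the cursor (the code completion position) has not yet fully left a Token, dt-sql-parser considers that Token unfinished, and code completion infers what can be filled in at that Token's position.** Treat the column in the figure as the cursor position. Placed in an editor, this text yields 13 possible cursor positions, while for dt-sql-parser the text is parsed into 4 Tokens. An important strategy of code completion is: **when the cursor (the code completion position) has not yet fully left a Token, dt-sql-parser considers that Token unfinished, and code completion infers what can be filled in at that Token's position.**
For example, to find out through code completion what should follow `SHOW`, the corresponding position information should be: For example, to find out through code completion what should follow `SHOW`, the corresponding position information should be: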
```javascript ```typescript
{ {
lineNumber: 1, lineNumber: 1,
column: 6 column: 6
@ -424,6 +433,7 @@ dt-sql-parser 的自动补全功能依赖于 [antlr4-c3](https://github.com/mike
<br/> <br/>
## License ## License
[MIT](./LICENSE) [MIT](./LICENSE)

README.md

@ -13,42 +13,27 @@
English | [简体中文](./README-zh_CN.md) English | [简体中文](./README-zh_CN.md)
dt-sql-parser is a **SQL Parser** project built with [ANTLR4](https://github.com/antlr/antlr4), aimed mainly at the **big data** field. [ANTLR4](https://github.com/antlr/antlr4) generates the basic Parser, Visitor, and Listener, which makes **syntax validation**, **tokenization**, and **AST traversal** straightforward. dt-sql-parser is a **SQL Parser** project built with [ANTLR4](https://github.com/antlr/antlr4), aimed mainly at the **big data** field. [ANTLR4](https://github.com/antlr/antlr4) generates the basic Parser, Visitor, and Listener, which makes **lexing**, **parsing**, and **AST traversal** straightforward.
Additionally, it provides auxiliary functions such as **SQL splitting** and **code completion**. Additionally, it provides advanced features such as **SQL validation**, **code completion**, and **collecting tables and columns in SQL**.
**Supported SQL**: **Supported SQL**:
- MySQL - MySQL
- Flink SQL - Flink
- Spark SQL - Spark
- Hive SQL - Hive
- PostgreSQL - PostgreSQL
- Trino SQL - Trino
- Impala SQL - Impala
**Supported auxiliary methods** >Tip: This project targets TypeScript by default; you can also try compiling the grammar to other languages if needed.
| SQL Type | SQL Splitting | Code Completion |
| ----------- | ------------ | --------------- |
| MySQL | ✅ | ✅ |
| Flink SQL | ✅ | ✅ |
| Spark SQL | ✅ | ✅ |
| Hive SQL | ✅ | ✅ |
| PostgreSQL | ✅ | ✅ |
| Trino SQL | ✅ | ✅ |
| Impala SQL | ✅ | ✅ |
>Tip: This project targets JavaScript by default; you can also try compiling the grammar to other languages if needed.
<br/> <br/>
## Integrating SQL Parser with Monaco Editor ## Integrating SQL Parser with Monaco Editor
We provide [monaco-sql-languages](https://github.com/DTStack/monaco-sql-languages), which makes it easy to integrate with `monaco-editor`. We also provide [monaco-sql-languages](https://github.com/DTStack/monaco-sql-languages) to easily integrate `dt-sql-parser` with `monaco-editor`.
>Tip: If you want to run `dt-sql-parser` in a browser, don't forget to install the `assert` and `util` polyfills and to define the global variable `process.env`.
None of this is needed in a node environment, because node has them built in.
<br/> <br/>
@ -65,49 +50,34 @@ yarn add dt-sql-parser
<br/> <br/>
## Usage ## Usage
We recommend learning the basic usage before continuing. The dt-sql-parser library provides SQL Parser classes for different types of SQL. We recommend learning the basic usage before continuing. The dt-sql-parser library provides SQL classes for different types of SQL.
```javascript ```javascript
import { MySQL, FlinkSQL, SparkSQL, HiveSQL, PostgresSQL, TrinoSQL, ImpalaSQL } from 'dt-sql-parser'; import { MySQL, FlinkSQL, SparkSQL, HiveSQL, PostgreSQL, TrinoSQL, ImpalaSQL } from 'dt-sql-parser';
``` ```
Before using syntax validation, code completion, and other features, instantiate the Parser for the relevant SQL type. Before using syntax validation, code completion, and other features, instantiate the Parser for the relevant SQL type.
Taking `MySQL` as an example: Taking `MySQL` as an example:
```javascript ```javascript
const parser = new MySQL(); const mysql = new MySQL();
``` ```
The following examples use `MySQL`; Parsers for other SQL types are used in the same way. The following examples use `MySQL`; Parsers for other SQL types are used in the same way.
### Syntax Validation ### Syntax Validation
First instantiate a SQL class, then call the **validate** method on the SQL instance to validate the SQL content. If validation fails, it returns an array containing **error** messages.
```javascript ```javascript
import { MySQL } from 'dt-sql-parser'; import { MySQL } from 'dt-sql-parser';
const parser = new MySQL(); const mysql = new MySQL();
const incorrectSql = 'selec id,name from user1;';
const errors = mysql.validate(incorrectSql);
const correctSql = 'select id,name from user1;';
const errors = parser.validate(correctSql);
console.log(errors); console.log(errors);
``` ```
*output:* *output:*
```javascript
/*
[]
*/
```
**Validate failed:**
```javascript
const incorrectSql = 'selec id,name from user1;'
const errors = parser.validate(incorrectSql);
console.log(errors);
```
*output:*
```javascript ```javascript
/* /*
[ [
@ -116,25 +86,24 @@ console.log(errors);
endLine: 1, endLine: 1,
startCol: 0, startCol: 0,
startLine: 1, startLine: 1,
message: "mismatched input 'SELEC' expecting {<EOF>, 'ALTER', 'ANALYZE', 'CALL', 'CHANGE', 'CHECK', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DROP', 'EXPLAIN', 'GET', 'GRANT', 'INSERT', 'KILL', 'LOAD', 'LOCK', 'OPTIMIZE', 'PURGE', 'RELEASE', 'RENAME', 'REPLACE', 'RESIGNAL', 'REVOKE', 'SELECT', 'SET', 'SHOW', 'SIGNAL', 'UNLOCK', 'UPDATE', 'USE', 'BEGIN', 'BINLOG', 'CACHE', 'CHECKSUM', 'COMMIT', 'DEALLOCATE', 'DO', 'FLUSH', 'HANDLER', 'HELP', 'INSTALL', 'PREPARE', 'REPAIR', 'RESET', 'ROLLBACK', 'SAVEPOINT', 'START', 'STOP', 'TRUNCATE', 'UNINSTALL', 'XA', 'EXECUTE', 'SHUTDOWN', '--', '(', ';'}" message: "..."
} }
] ]
*/ */
``` ```
We instantiated a Parser object and used the **validate** method to check the SQL syntax; if validation fails,
it returns an array of objects containing the **error** message.
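As a minimal sketch of what one might do with the returned array, the snippet below formats each entry as a `line:col message` string. The `SyntaxErrorLike` interface and `formatErrors` helper are hypothetical names introduced here; the interface only mirrors the fields visible in the sample output (`startLine`, `endLine`, `startCol`, `message`) and is not the library's own type. The sample array stands in for the result of `mysql.validate(incorrectSql)`.

```typescript
// Hypothetical shape: mirrors only the fields shown in the sample output above.
interface SyntaxErrorLike {
    startLine: number;
    endLine: number;
    startCol: number;
    message: string;
}

// Turn each error into a single "line:col message" string for logging.
function formatErrors(errors: SyntaxErrorLike[]): string[] {
    return errors.map((e) => `${e.startLine}:${e.startCol} ${e.message}`);
}

// Stand-in for the array returned by `mysql.validate(incorrectSql)`.
const sampleErrors: SyntaxErrorLike[] = [
    { startLine: 1, endLine: 1, startCol: 0, message: "mismatched input 'SELEC'" },
];

const formatted = formatErrors(sampleErrors);
console.log(formatted);
```

An empty array from `validate` would simply produce an empty list here, which keeps the "no errors" case free of special handling.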
### Tokenizer ### Tokenizer
Get all **tokens** by the Parser: Call the `getAllTokens` method on the SQL instance:
```javascript ```javascript
import { MySQL } from 'dt-sql-parser'; import { MySQL } from 'dt-sql-parser';
const parser = new MySQL() const mysql = new MySQL()
const sql = 'select id,name,sex from user1;' const sql = 'select id,name,sex from user1;'
const tokens = parser.getAllTokens(sql) const tokens = mysql.getAllTokens(sql)
console.log(tokens) console.log(tokens)
``` ```
@ -164,36 +133,28 @@ console.log(tokens)
Traverse the tree node by the Visitor: Traverse the tree node by the Visitor:
```typescript ```typescript
import { MySQL, AbstractParseTreeVisitor } from 'dt-sql-parser'; import { MySQL, MySqlParserVisitor } from 'dt-sql-parser';
import type { MySqlParserVisitor } from 'dt-sql-parser';
const parser = new MySQL(); const mysql = new MySQL();
const sql = `select id,name from user1;`; const sql = `select id, name from user1;`;
const tree = parser.parse(sql); const parseTree = mysql.parse(sql);
type Result = string; class MyVisitor extends MySqlParserVisitor<string> {
defaultResult(): string {
class MyVisitor extends AbstractParseTreeVisitor<Result> implements MySqlParserVisitor<Result> {
protected defaultResult() {
return ''; return '';
} }
visitTableName(ctx) { aggregateResult(aggregate: string, nextResult: string): string {
let tableName = ctx.text.toLowerCase(); return aggregate + nextResult;
console.log('TableName:', tableName);
return '';
}
visitSelectElements(ctx) {
let selectElements = ctx.text.toLowerCase();
console.log('SelectElements:', selectElements);
return '';
}
visitProgram(ctx) { // program is root rule
this.visitChildren(ctx);
return 'Return by program context'
} }
visitProgram = (ctx) => {
return this.visitChildren(ctx);
};
visitTableName = (ctx) => {
return ctx.getText();
};
} }
const visitor = new MyVisitor(); const visitor = new MyVisitor();
const result = visitor.visit(tree); const result = visitor.visit(parseTree);
console.log(result); console.log(result);
``` ```
@ -202,60 +163,52 @@ console.log(result);
```javascript ```javascript
/* /*
SelectElements: id,name user1
TableName: user1
*/
/*
Return by program node
*/ */
``` ```
> Tips: The node's method name can be found in the Visitor file under the corresponding SQL directory
### Listener ### Listener
Access the specified node in the AST by the Listener Access the specified node in the AST by the Listener
```typescript ```typescript
import { MySQL } from 'dt-sql-parser'; import { MySQL, MySqlParserListener } from 'dt-sql-parser';
import type { MySqlParserListener } from 'dt-sql-parser';
const parser = new MySQL(); const mysql = new MySQL();
const sql = 'select id,name from user1;'; const sql = 'select id, name from user1;';
const parseTree = parser.parse(sql); const parseTree = mysql.parse(sql);
class MyListener implements MySqlParserListener { class MyListener extends MySqlParserListener {
enterTableName(ctx) { result = '';
let tableName = ctx.text.toLowerCase(); enterTableName = (ctx): void => {
console.log('TableName:', tableName); this.result = ctx.getText();
} };
enterSelectElements(ctx) {
let selectElements = ctx.text.toLowerCase();
console.log('SelectElements:', selectElements);
}
} }
const listenTableName = new MyListener();
parser.listen(listenTableName as MySqlParserListener, parseTree); const listener = new MyListener();
mysql.listen(listener, parseTree);
console.log(listener.result)
``` ```
*output:* *output:*
```javascript ```javascript
/* /*
SelectElements: id,name user1
TableName: user1
*/ */
``` ```
> Tips: The node's method name can be found in the Listener file under the corresponding SQL directory
### Splitting SQL statements ### Splitting SQL statements
Take `FlinkSQL` as an example: Taking `FlinkSQL` as an example, call the `splitSQLByStatement` method on the SQL instance:
```javascript ```javascript
import { FlinkSQL } from 'dt-sql-parser'; import { FlinkSQL } from 'dt-sql-parser';
const parser = new FlinkSQL();
const flink = new FlinkSQL();
const sql = 'SHOW TABLES;\nSELECT * FROM tb;'; const sql = 'SHOW TABLES;\nSELECT * FROM tb;';
const sqlSlices = parser.splitSQLByStatement(sql); const sqlSlices = flink.splitSQLByStatement(sql);
console.log(sqlSlices) console.log(sqlSlices)
``` ```
@ -288,17 +241,18 @@ console.log(sqlSlices)
### Code Completion ### Code Completion
Obtain code completion information at a specified position in SQL. Obtain code completion information at a specified position in SQL.
The examples use `FlinkSQL`.
Invoke the `getSuggestionAtCaretPosition` method, pass the SQL content and the row and column numbers indicating the position where code completion is desired. The following are some additional explanations about [CaretPosition](#caretposition-of-code-completion). Call the `getSuggestionAtCaretPosition` method on the SQL instance, passing the SQL content and the line and column numbers of the position where code completion is desired. Some additional notes on [CaretPosition](#caretposition-of-code-completion) appear later in this document.
+ **keyword candidates list** + **keyword candidates list**
```javascript ```javascript
import { FlinkSQL } from 'dt-sql-parser'; import { FlinkSQL } from 'dt-sql-parser';
const parser = new FlinkSQL();
const flink = new FlinkSQL();
const sql = 'CREATE '; const sql = 'CREATE ';
const pos = { lineNumber: 1, column: 16 }; // the end position const pos = { lineNumber: 1, column: 16 }; // the end position
const keywords = parser.getSuggestionAtCaretPosition(sql, pos)?.keywords; const keywords = flink.getSuggestionAtCaretPosition(sql, pos)?.keywords;
console.log(keywords); console.log(keywords);
``` ```
*output:* *output:*
@ -309,10 +263,13 @@ Invoke the `getSuggestionAtCaretPosition` method, pass the SQL content and the r
``` ```
+ **Obtaining information related to grammar completion** + **Obtaining information related to grammar completion**
```javascript ```javascript
const parser = new FlinkSQL(); import { FlinkSQL } from 'dt-sql-parser';
const flink = new FlinkSQL();
const sql = 'SELECT * FROM tb'; const sql = 'SELECT * FROM tb';
const pos = { lineNumber: 1, column: 16 }; // after 'tb' const pos = { lineNumber: 1, column: 16 }; // after 'tb'
const syntaxSuggestions = parser.getSuggestionAtCaretPosition(sql, pos)?.syntax; const syntaxSuggestions = flink.getSuggestionAtCaretPosition(sql, pos)?.syntax;
console.log(syntaxSuggestions); console.log(syntaxSuggestions);
``` ```
*output:* *output:*
@ -350,6 +307,55 @@ Invoke the `getSuggestionAtCaretPosition` method, pass the SQL content and the r
``` ```
The grammar-related code completion information returns an array, where each item represents what grammar can be filled in at that position. For example, the output in the above example represents that the position can be filled with either a **table name** or **a view name**. In this case, `syntaxContextType` represents the type of grammar that can be completed, and `wordRanges` represents the content that has already been filled. The grammar-related code completion information returns an array, where each item represents what grammar can be filled in at that position. For example, the output in the above example represents that the position can be filled with either a **table name** or **a view name**. In this case, `syntaxContextType` represents the type of grammar that can be completed, and `wordRanges` represents the content that has already been filled.
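One way to consume such an array is to look up a given `syntaxContextType` and recover what the user has already typed from its `wordRanges`. The sketch below is illustrative only: `SyntaxSuggestionLike`, `WordRangeLike`, and `typedPrefixFor` are hypothetical names, the `text` field on each word range is an assumption, and the literal type values `'table'` and `'view'` are placeholders rather than confirmed library constants.

```typescript
// Hypothetical shapes: only `syntaxContextType` and `wordRanges` are named in the text above.
// The `text` field on a word range is an assumption made for illustration.
interface WordRangeLike {
    text: string;
}

interface SyntaxSuggestionLike {
    syntaxContextType: string;
    wordRanges: WordRangeLike[];
}

// Find the first suggestion of the wanted type and join what has already been typed.
// Returns null when the position does not accept that kind of syntax.
function typedPrefixFor(suggestions: SyntaxSuggestionLike[], type: string): string | null {
    const hit = suggestions.find((s) => s.syntaxContextType === type);
    return hit ? hit.wordRanges.map((w) => w.text).join('') : null;
}

// Stand-in for `flink.getSuggestionAtCaretPosition(sql, pos)?.syntax`.
const sampleSuggestions: SyntaxSuggestionLike[] = [
    { syntaxContextType: 'table', wordRanges: [{ text: 'tb' }] },
    { syntaxContextType: 'view', wordRanges: [{ text: 'tb' }] },
];

const tablePrefix = typedPrefixFor(sampleSuggestions, 'table');
const columnPrefix = typedPrefixFor(sampleSuggestions, 'column');
console.log(tablePrefix, columnPrefix);
```

The recovered prefix could then be used to narrow a list of known table names before showing them in an editor.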
### Get all entities in SQL (e.g. table, column)
Call the `getAllEntities` method on the SQL instance, passing the SQL text and, optionally, the line and column numbers of a position.
```typescript
import { FlinkSQL } from 'dt-sql-parser';
const flink = new FlinkSQL();
const sql = 'SELECT * FROM tb;';
const pos = { lineNumber: 1, column: 16 }; // after 'tb'
const entities = flink.getAllEntities(sql, pos);
console.log(entities);
```
*output*
```typescript
/*
[
{
entityContextType: 'table',
text: 'tb',
position: {
line: 1,
startIndex: 14,
endIndex: 15,
startColumn: 15,
endColumn: 17
},
belongStmt: {
stmtContextType: 'selectStmt',
position: [Object],
rootStmt: [Object],
parentStmt: [Object],
isContainCaret: true
},
relatedEntities: null,
columns: null,
isAlias: false,
origin: null,
alias: null
}
]
*/
```
The position is optional. If it is passed, every collected entity that lies within the statement containing that position has its owning statement object marked with `isContainCaret`. Combined with code completion, this helps you quickly filter out the entities you need.
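The filtering described here can be sketched as follows. `EntityLike` and `entitiesNearCaret` are hypothetical names; the interface copies only a few fields from the sample output (`entityContextType`, `text`, `belongStmt.stmtContextType`, `belongStmt.isContainCaret`), and treating `isContainCaret` as possibly absent is an assumption.

```typescript
// Hypothetical shape: a subset of the fields shown in the sample output above.
interface EntityLike {
    entityContextType: string;
    text: string;
    belongStmt: {
        stmtContextType: string;
        isContainCaret?: boolean; // assumed to be absent when the caret is elsewhere
    };
}

// Keep only entities whose owning statement contains the caret.
function entitiesNearCaret(entities: EntityLike[]): EntityLike[] {
    return entities.filter((e) => e.belongStmt.isContainCaret === true);
}

// Stand-in for the array returned by `flink.getAllEntities(sql, pos)`.
const sampleEntities: EntityLike[] = [
    {
        entityContextType: 'table',
        text: 'tb',
        belongStmt: { stmtContextType: 'selectStmt', isContainCaret: true },
    },
    {
        entityContextType: 'table',
        text: 'other_tb',
        belongStmt: { stmtContextType: 'selectStmt' },
    },
];

const nearCaret = entitiesNearCaret(sampleEntities).map((e) => e.text);
console.log(nearCaret);
```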
### Other API ### Other API
- `createLexer` Create an instance of Antlr4 Lexer and return it; - `createLexer` Create an instance of Antlr4 Lexer and return it;
@ -428,6 +434,7 @@ At this time, dt-sql-parser will think that `SHOW` is already a complete Token,
For the editor, this strategy is also more intuitive. After the user enters `SHOW`, before pressing the space key, the user probably has not finished entering, maybe the user wants to enter something like `SHOWS`. When the user presses the space key, the editor thinks that the user wants to enter the next Token, and it is time to ask dt-sql-parser what can be filled in the next Token position. For the editor, this strategy is also more intuitive. After the user enters `SHOW`, before pressing the space key, the user probably has not finished entering, maybe the user wants to enter something like `SHOWS`. When the user presses the space key, the editor thinks that the user wants to enter the next Token, and it is time to ask dt-sql-parser what can be filled in the next Token position.
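When the position comes from something other than an editor, for example a plain string offset, it has to be converted into a line number and column first. The helper below is a rough sketch under the conventions described in this README (line numbers start at 1, columns behave like cursor positions starting at 1). `offsetToCaretPosition` is a hypothetical name, not part of dt-sql-parser, and it assumes `\n` line endings only.

```typescript
// Same field names as the position objects used in the examples above.
interface CaretPosition {
    lineNumber: number;
    column: number;
}

// Convert a 0-based character offset into a 1-based line number and
// a 1-based, cursor-style column. Assumes lines are separated by '\n'.
function offsetToCaretPosition(text: string, offset: number): CaretPosition {
    const clamped = Math.max(0, Math.min(offset, text.length));
    const before = text.slice(0, clamped);
    const lastNewline = before.lastIndexOf('\n'); // -1 when still on the first line
    return {
        lineNumber: before.split('\n').length,
        column: clamped - lastNewline,
    };
}

// Cursor placed after the space that follows SHOW.
const afterShow = offsetToCaretPosition('SHOW ', 5);
// Cursor placed at the start of the second line.
const secondLine = offsetToCaretPosition('SHOW TABLES;\nSELECT', 13);
console.log(afterShow, secondLine);
```

With the input `'SHOW '` and offset 5, the helper yields line 1, column 6, which matches the position used in the `SHOW` example.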
<br/> <br/>
## License ## License
[MIT](./LICENSE) [MIT](./LICENSE)