[doc] support ngram_search function (#899)
https://github.com/apache/doris/pull/38226
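
For reviewers, the formula in the new page can be checked against its two examples with a short sketch. This is only an illustration of the documented arithmetic, not Doris's implementation: the helper names `ngram_set` and `ngram_similarity` are hypothetical, N-grams are assumed to be distinct character substrings, and `n` is assumed to be at least 1.

```python
def ngram_set(s: str, n: int) -> set:
    """Return the distinct contiguous n-character substrings of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(text: str, pattern: str, n: int) -> float:
    """Compute 2 * |Intersection| / (|text set| + |pattern set|)."""
    # Documented edge case: an input shorter than n yields 0.
    if len(text) < n or len(pattern) < n:
        return 0.0
    text_set = ngram_set(text, n)
    pattern_set = ngram_set(pattern, n)
    shared = text_set & pattern_set
    return 2 * len(shared) / (len(text_set) + len(pattern_set))


# The two examples from the documentation page.
print(ngram_similarity("123456789", "12345", 3))    # 2 * 3 / (7 + 3)
print(ngram_similarity("abababab", "babababa", 2))  # 2 * 2 / (2 + 2)
```

The second call shows why a similarity of 1 does not imply equal strings: both inputs reduce to the same bigram set {"ab", "ba"}.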
diff --git a/docs/sql-manual/sql-functions/string-functions/ngram-search.md b/docs/sql-manual/sql-functions/string-functions/ngram-search.md
new file mode 100644
index 0000000..ae42731
--- /dev/null
+++ b/docs/sql-manual/sql-functions/string-functions/ngram-search.md
@@ -0,0 +1,67 @@
+---
+{
+ "title": "NGRAM_SEARCH",
+ "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Description
+
+Calculates the N-gram similarity between `text` and `pattern`. The result ranges from 0 to 1; a higher value means the two strings are more similar.
+
+Both `pattern` and `gram_num` must be constants. If the length of either `text` or `pattern` is less than `gram_num`, the function returns 0.
+
+N-gram similarity is a text similarity measure based on N-grams. An N-gram is a contiguous sequence of N characters (or words) taken from a text string. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
+
+The N-gram similarity is calculated as:
+
+2 * |Intersection| / (|text set| + |pattern set|)
+
+where |text set| and |pattern set| are the numbers of distinct N-grams in `text` and `pattern`, and |Intersection| is the number of N-grams that appear in both sets.
+
+Note that, by definition, a similarity of 1 does not necessarily mean the two strings are identical.
+
+Only ASCII-encoded strings are supported.
+
+## Syntax
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+## Example
+
+```sql
+mysql> select ngram_search('123456789' , '12345' , 3);
++---------------------------------------+
+| ngram_search('123456789', '12345', 3) |
++---------------------------------------+
+|                                   0.6 |
++---------------------------------------+
+
+mysql> select ngram_search("abababab","babababa",2);
++-----------------------------------------+
+| ngram_search('abababab', 'babababa', 2) |
++-----------------------------------------+
+|                                       1 |
++-----------------------------------------+
+```
+## keywords
+ NGRAM_SEARCH,NGRAM,SEARCH
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/string-functions/ngram-search.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/string-functions/ngram-search.md
new file mode 100644
index 0000000..1a2eecc
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/string-functions/ngram-search.md
@@ -0,0 +1,67 @@
+---
+{
+ "title": "NGRAM_SEARCH",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Description
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
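+Calculates the N-gram similarity between `text` and `pattern`. The similarity ranges from 0 to 1; a higher value means the two strings are more similar.
+Both `pattern` and `gram_num` must be constants.
+If the length of `text` or `pattern` is less than `gram_num`, the function returns 0.
+
+N-gram similarity is a text similarity measure based on N-grams. An N-gram is a contiguous sequence of N characters or words taken from a text string. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
+
+The N-gram similarity is calculated as 2 * |Intersection| / (|text set| + |pattern set|)
+
+where |text set| and |pattern set| are the numbers of distinct N-grams in `text` and `pattern`, and |Intersection| is the number of N-grams in the intersection of the two sets.
+
+Note that, by definition, a similarity of 1 does not mean the two strings are identical.
+
+Only ASCII-encoded strings are supported.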
+
+## Syntax
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+## Example
+
+```sql
+mysql> select ngram_search('123456789' , '12345' , 3);
++---------------------------------------+
+| ngram_search('123456789', '12345', 3) |
++---------------------------------------+
+|                                   0.6 |
++---------------------------------------+
+
+mysql> select ngram_search("abababab","babababa",2);
++-----------------------------------------+
+| ngram_search('abababab', 'babababa', 2) |
++-----------------------------------------+
+|                                       1 |
++-----------------------------------------+
+```
+## keywords
+ NGRAM_SEARCH,NGRAM,SEARCH
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/string-functions/ngram-search.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/string-functions/ngram-search.md
new file mode 100644
index 0000000..e080165
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-functions/string-functions/ngram-search.md
@@ -0,0 +1,65 @@
+---
+{
+ "title": "NGRAM_SEARCH",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Description
+
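+Calculates the N-gram similarity between `text` and `pattern`. The similarity ranges from 0 to 1; a higher value means the two strings are more similar.
+Both `pattern` and `gram_num` must be constants.
+If the length of `text` or `pattern` is less than `gram_num`, the function returns 0.
+
+N-gram similarity is a text similarity measure based on N-grams. An N-gram is a contiguous sequence of N characters or words taken from a text string. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
+
+The N-gram similarity is calculated as 2 * |Intersection| / (|text set| + |pattern set|)
+
+where |text set| and |pattern set| are the numbers of distinct N-grams in `text` and `pattern`, and |Intersection| is the number of N-grams in the intersection of the two sets.
+
+Note that, by definition, a similarity of 1 does not mean the two strings are identical.
+
+Only ASCII-encoded strings are supported.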
+
+## Syntax
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+## Example
+
+```sql
+mysql> select ngram_search('123456789' , '12345' , 3);
++---------------------------------------+
+| ngram_search('123456789', '12345', 3) |
++---------------------------------------+
+|                                   0.6 |
++---------------------------------------+
+
+mysql> select ngram_search("abababab","babababa",2);
++-----------------------------------------+
+| ngram_search('abababab', 'babababa', 2) |
++-----------------------------------------+
+|                                       1 |
++-----------------------------------------+
+```
+## keywords
+ NGRAM_SEARCH,NGRAM,SEARCH
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/string-functions/ngram-search.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/string-functions/ngram-search.md
new file mode 100644
index 0000000..e080165
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-functions/string-functions/ngram-search.md
@@ -0,0 +1,65 @@
+---
+{
+ "title": "NGRAM_SEARCH",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Description
+
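+Calculates the N-gram similarity between `text` and `pattern`. The similarity ranges from 0 to 1; a higher value means the two strings are more similar.
+Both `pattern` and `gram_num` must be constants.
+If the length of `text` or `pattern` is less than `gram_num`, the function returns 0.
+
+N-gram similarity is a text similarity measure based on N-grams. An N-gram is a contiguous sequence of N characters or words taken from a text string. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
+
+The N-gram similarity is calculated as 2 * |Intersection| / (|text set| + |pattern set|)
+
+where |text set| and |pattern set| are the numbers of distinct N-grams in `text` and `pattern`, and |Intersection| is the number of N-grams in the intersection of the two sets.
+
+Note that, by definition, a similarity of 1 does not mean the two strings are identical.
+
+Only ASCII-encoded strings are supported.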
+
+## Syntax
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+## Example
+
+```sql
+mysql> select ngram_search('123456789' , '12345' , 3);
++---------------------------------------+
+| ngram_search('123456789', '12345', 3) |
++---------------------------------------+
+|                                   0.6 |
++---------------------------------------+
+
+mysql> select ngram_search("abababab","babababa",2);
++-----------------------------------------+
+| ngram_search('abababab', 'babababa', 2) |
++-----------------------------------------+
+|                                       1 |
++-----------------------------------------+
+```
+## keywords
+ NGRAM_SEARCH,NGRAM,SEARCH
diff --git a/sidebars.json b/sidebars.json
index 4f7831a..fc9c971 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -895,6 +895,7 @@
"sql-manual/sql-functions/string-functions/split-by-regexp",
"sql-manual/sql-functions/string-functions/substring-index",
"sql-manual/sql-functions/string-functions/money-format",
+ "sql-manual/sql-functions/string-functions/ngram-search",
"sql-manual/sql-functions/string-functions/parse-url",
"sql-manual/sql-functions/string-functions/quote",
"sql-manual/sql-functions/string-functions/url-decode",
diff --git a/versioned_docs/version-2.1/sql-manual/sql-functions/string-functions/ngram-search.md b/versioned_docs/version-2.1/sql-manual/sql-functions/string-functions/ngram-search.md
new file mode 100644
index 0000000..a39c0f6
--- /dev/null
+++ b/versioned_docs/version-2.1/sql-manual/sql-functions/string-functions/ngram-search.md
@@ -0,0 +1,69 @@
+---
+{
+ "title": "NGRAM_SEARCH",
+ "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Description
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+Calculates the N-gram similarity between `text` and `pattern`. The result ranges from 0 to 1; a higher value means the two strings are more similar.
+
+Both `pattern` and `gram_num` must be constants. If the length of either `text` or `pattern` is less than `gram_num`, the function returns 0.
+
+N-gram similarity is a text similarity measure based on N-grams. An N-gram is a contiguous sequence of N characters (or words) taken from a text string. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
+
+The N-gram similarity is calculated as:
+
+2 * |Intersection| / (|text set| + |pattern set|)
+
+where |text set| and |pattern set| are the numbers of distinct N-grams in `text` and `pattern`, and |Intersection| is the number of N-grams that appear in both sets.
+
+Note that, by definition, a similarity of 1 does not necessarily mean the two strings are identical.
+
+Only ASCII-encoded strings are supported.
+
+## Syntax
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+## Example
+
+```sql
+mysql> select ngram_search('123456789' , '12345' , 3);
++---------------------------------------+
+| ngram_search('123456789', '12345', 3) |
++---------------------------------------+
+|                                   0.6 |
++---------------------------------------+
+
+mysql> select ngram_search("abababab","babababa",2);
++-----------------------------------------+
+| ngram_search('abababab', 'babababa', 2) |
++-----------------------------------------+
+|                                       1 |
++-----------------------------------------+
+```
+## keywords
+ NGRAM_SEARCH,NGRAM,SEARCH
diff --git a/versioned_docs/version-3.0/sql-manual/sql-functions/string-functions/ngram-search.md b/versioned_docs/version-3.0/sql-manual/sql-functions/string-functions/ngram-search.md
new file mode 100644
index 0000000..ae42731
--- /dev/null
+++ b/versioned_docs/version-3.0/sql-manual/sql-functions/string-functions/ngram-search.md
@@ -0,0 +1,67 @@
+---
+{
+ "title": "NGRAM_SEARCH",
+ "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Description
+
+Calculates the N-gram similarity between `text` and `pattern`. The result ranges from 0 to 1; a higher value means the two strings are more similar.
+
+Both `pattern` and `gram_num` must be constants. If the length of either `text` or `pattern` is less than `gram_num`, the function returns 0.
+
+N-gram similarity is a text similarity measure based on N-grams. An N-gram is a contiguous sequence of N characters (or words) taken from a text string. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
+
+The N-gram similarity is calculated as:
+
+2 * |Intersection| / (|text set| + |pattern set|)
+
+where |text set| and |pattern set| are the numbers of distinct N-grams in `text` and `pattern`, and |Intersection| is the number of N-grams that appear in both sets.
+
+Note that, by definition, a similarity of 1 does not necessarily mean the two strings are identical.
+
+Only ASCII-encoded strings are supported.
+
+## Syntax
+
+`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+
+## Example
+
+```sql
+mysql> select ngram_search('123456789' , '12345' , 3);
++---------------------------------------+
+| ngram_search('123456789', '12345', 3) |
++---------------------------------------+
+|                                   0.6 |
++---------------------------------------+
+
+mysql> select ngram_search("abababab","babababa",2);
++-----------------------------------------+
+| ngram_search('abababab', 'babababa', 2) |
++-----------------------------------------+
+|                                       1 |
++-----------------------------------------+
+```
+## keywords
+ NGRAM_SEARCH,NGRAM,SEARCH
diff --git a/versioned_sidebars/version-2.1-sidebars.json b/versioned_sidebars/version-2.1-sidebars.json
index 0746448..7572a6b 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -840,6 +840,7 @@
"sql-manual/sql-functions/string-functions/split-by-string",
"sql-manual/sql-functions/string-functions/substring-index",
"sql-manual/sql-functions/string-functions/money-format",
+ "sql-manual/sql-functions/string-functions/ngram-search",
"sql-manual/sql-functions/string-functions/parse-url",
"sql-manual/sql-functions/string-functions/quote",
"sql-manual/sql-functions/string-functions/url-decode",
diff --git a/versioned_sidebars/version-3.0-sidebars.json b/versioned_sidebars/version-3.0-sidebars.json
index f82c6e8..d7a7efc 100644
--- a/versioned_sidebars/version-3.0-sidebars.json
+++ b/versioned_sidebars/version-3.0-sidebars.json
@@ -885,6 +885,7 @@
"sql-manual/sql-functions/string-functions/split-by-string",
"sql-manual/sql-functions/string-functions/substring-index",
"sql-manual/sql-functions/string-functions/money-format",
+ "sql-manual/sql-functions/string-functions/ngram-search",
"sql-manual/sql-functions/string-functions/parse-url",
"sql-manual/sql-functions/string-functions/quote",
"sql-manual/sql-functions/string-functions/url-decode",