59 Star 535 Fork 88

GVP万里数据库 / GreatSQL

 / 详情

8.0.32-25 版本并行查询问题

已确认
缺陷
创建于  
2024-03-13 15:59

【 使用版本 】
8.0.32-25 版本

【 使用环境 】
测试环境

【问题描述】
开启并行查询时, 一条简单sql带有orderby主键 limit较大时候的会比不开并行查询慢3-6倍

【复现路径】
默认开启并行查询,
创建一张500w数据的表, id为主键

select * from order_dev ORDER BY id asc limit 1500000,1000;
select /*+ NO_PQ */ * from order_dev ORDER BY id asc limit 1500000,1000;

第一条sql执行20.759秒
第二条sql执行3.392秒

评论 (26)

阿旗met 创建了缺陷
GreatSQL 任务状态待确认 修改为已确认
展开全部操作日志

好的,我试一下

看了一下Rapid引擎还处于Alpha版本阶段,尚未达到GA(General Availability)阶段. 还是不敢上生产, 那并行查询这个问题会先处理吗?

有几个问题请帮忙确认下:

  1. 请问这两个结果耗时是多次测试都这样吗
  2. 请分别提供两个SQL的执行计划

GreatSQL 8.0.32-25最新版本发布了Rapid引擎,推荐您使用Rapid引擎加速AP类型查询。详见:https://greatsql.cn/docs/8032-25/user-manual/5-enhance/5-1-highperf-rapid-engine.html

回答:
1: 是的, 多次测试都是一样, 并行查询的带有limit一个较大的数是, 一直很慢.
输入图片说明
输入图片说明
2:
并行查询的执行计划
EXPLAIN format=tree select * from order_dev ORDER BY id asc limit 1500000,1000;
-> Limit/Offset: 1000/1500000 row(s)
-> Parallel scan on
-> Limit: 1501000 row(s) (cost=187605.08 rows=1501000)
-> PQblock scan on order_dev using PRIMARY (cost=187605.08 rows=1501000)

未用并行查询的执行计划
EXPLAIN format=tree select /*+ NO_PQ */ * from order_dev ORDER BY id asc limit 1500000,1000;
-> Limit/Offset: 1000/1500000 row(s) (cost=187605.08 rows=1000)
-> Index scan on order_dev using PRIMARY (cost=187605.08 rows=1501000)

经过测试, 多表join然后带个limit较大的数在并行查询也会出现这个比正常查询慢几倍的问题, 可能是带有limit的sql在并行查询下都会出现这个问题

执行计划改用这个方式查看
EXPLAIN ANALYZE select ...
这样能同时真正执行一次,并打印出真实数据

并行查询:
-> Limit/Offset: 1000/1500000 row(s) (actual time=18355.660..18366.767 rows=1000 loops=1)
-> Parallel scan on (actual time=3.655..18331.681 rows=1501000 loops=1)
-> Limit: 1501000 row(s) (cost=185794.68 rows=1501000) (actual time=0.016..1376.907 rows=510624 loops=1)
-> PQblock scan on order_dev using PRIMARY (cost=185794.68 rows=1501000) (actual time=0.015..1353.090 rows=510624 loops=1)

未用并行查询:
-> Limit/Offset: 1000/1500000 row(s) (cost=185865.23 rows=1000) (actual time=3085.989..3088.142 rows=1000 loops=1)
-> Index scan on order_dev using PRIMARY (cost=185865.23 rows=1501000) (actual time=0.153..3053.278 rows=1501000 loops=1)

看起来应该是个bug,感谢您的反馈
并再次建议您尝试下我们新推出的Rapid引擎哈

看到GreatSQL开发了新的Rapid引擎,我正准备试一下,
但是我准备把我们的MySQL升级成GreatSQL-8.0.32-25时,但遭遇了如下错误,请问怎么解决。
我们的mysql版本是mysql 8.0.36

 /usr/local/GreatSQL-8.0.32-25/bin/mysqld --defaults-file=/etc/mysql/9004.cnf   --user=mysql &
[1] 15707
[root@yingji-db04.produce.yz ~]# 2024-03-19T07:14:58.433496Z 0 [Warning] [MY-010097] [Server] Insecure configuration for --secure-file-priv: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
2024-03-19T07:14:58.433503Z 0 [Warning] [MY-010097] [Server] Insecure configuration for --secure-log-path: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
2024-03-19T07:14:58.433631Z 0 [System] [MY-010116] [Server] /usr/local/GreatSQL-8.0.32-25/bin/mysqld (mysqld 8.0.32-25) starting as process 15707
2024-03-19T07:14:58.445817Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-03-19T07:14:59.597040Z 1 [ERROR] [MY-013171] [InnoDB] Cannot boot server version 80032 on data directory built by version 80036. Downgrade is not supported
mysqld: Can't open file: 'mysql.ibd' (errno: 0 - )
2024-03-19T07:15:00.274727Z 1 [ERROR] [MY-010334] [Server] Failed to initialize DD Storage Engine
2024-03-19T07:15:00.275416Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2024-03-19T07:15:00.275457Z 0 [ERROR] [MY-010119] [Server] Aborting
2024-03-19T07:15:00.276532Z 0 [System] [MY-010910] [Server] /usr/local/GreatSQL-8.0.32-25/bin/mysqld: Shutdown complete (mysqld 8.0.32-25)  GreatSQL, Release 25, Revision 79f57097e3f.

MySQL 8.0不支持原地降级启动,需要用逻辑备份方式导出数据后,再在GreatSQL 8.0.32-25版本中导入,参考用户手册:https://greatsql.cn/docs/8032-25/user-manual/7-migrate-and-upgrade/2-migrate-from-mysql-togreatsql.html

show create table x;
请用这种方式提供下表DDL

知道怎么回事了,是bit类型不支持导致的
IsUse bit(1) NOT NULL DEFAULT b'1' COMMENT '是否启用',

我现在已经升级为GreatSQL 8.0.32-25了,但是有个表遇到如下错误,我看数据类型应该是支持的呀,为啥还报错:

mysql> ALTER TABLE BIZ_Enterprise                 SECONDARY_LOAD;
ERROR 3877 (HY000): The data type is not yet supported

mysql>  desc BIZ_Enterprise;
+--------------------+---------------+------+-----+-------------------+-----------------------------------------------+
| Field              | Type          | Null | Key | Default           | Extra                                         |
+--------------------+---------------+------+-----+-------------------+-----------------------------------------------+
| RecordID           | int           | NO   | PRI | NULL              | auto_increment                                |
| Guid               | varchar(36)   | NO   |     | NULL              |                                               |
| EnterpriseID       | bigint        | NO   | UNI | NULL              |                                               |
| EnterpriseName     | varchar(256)  | YES  |     | NULL              |                                               |
| EnterpriseFullName | varchar(256)  | YES  |     | NULL              |                                               |
| PID                | bigint        | YES  | MUL | 0                 |                                               |
| Level              | int           | YES  |     | 1                 |                                               |
| AreaID             | bigint        | YES  |     | NULL              |                                               |
| BeginDate          | date          | YES  |     | NULL              |                                               |
| Remarks            | varchar(1024) | YES  |     | NULL              |                                               |
| IsUse              | bit(1)        | NO   |     | b'1'              |                                               |
| OwnerID            | bigint        | NO   |     | NULL              |                                               |
| CreatedDate        | datetime      | NO   |     | CURRENT_TIMESTAMP | DEFAULT_GENERATED                             |
| ModifiedDate       | timestamp     | NO   |     | CURRENT_TIMESTAMP | DEFAULT_GENERATED on update CURRENT_TIMESTAMP |
+--------------------+---------------+------+-----+-------------------+-----------------------------------------------+
14 rows in set (0.01 sec)


对于表中有不支持的数据类型时,就没办法使用Rapid存储引擎了吧
我发现我们好多表都用了bit类型,看来暂时还用不了Rapid。
希望能支持bit类型。
输入图片说明

您可以先把bit改成int看能否满足业务需求
我们也争取在未来的版本中尽快支持bit类型,再次感谢您的反馈

好的,我将bit改成int后,我的sql又遇到新的问题了,这是sql_mode控制的吗,但是我已经将sql_mode里面的only_full_group去掉了:

SELECT a.CustomerID
	, SUM(CASE 
		WHEN a.DataDate >= '2023-12-01' THEN b.Amount - b.AmountTotal
		ELSE 0
	END) AS XCDiscount
	, SUM(CASE 
		WHEN a.DataDate < '2023-12-01'
			AND a.DataDate >= qcdate.DataDate
		THEN AmountTotal
		ELSE 0
	END) AS XSFS_QC
	, SUM(CASE 
		WHEN a.DataDate >= '2023-12-01' THEN t3.yfAmount
		ELSE 0
	END) AS yfAmount
	, SUM(CASE 
		WHEN a.DataDate >= '2023-12-01' THEN Amount
		ELSE 0
	END) AS Amount
	, SUM(CASE 
		WHEN a.DataDate < '2023-12-01'
			AND a.DataDate >= qcdate.DataDate
		THEN t3.yfAmount
		ELSE 0
	END) AS Qmyf
	, cus.CustomerName AS SummaryType1Name, cus.CustomerID AS SummaryType1, '客户' AS SummaryType1FieldName
FROM test_Qlw_Business.SA_Sales a
	INNER JOIN test_Qlw_Business.SA_SalesDetail_czx b ON a.NumericalOrder = b.NumericalOrder
	LEFT JOIN (
		SELECT c.EnterpriseID, MAX(IFNULL(a.DataDate, '1990-01-01')) AS DataDate
		FROM qlw_test_com.BIZ_Enterprise_czx c
			LEFT JOIN test_Qlw_Business.FD_SettleReceipt a
			ON a.EnterpriseID = c.EnterpriseID
				AND SettleReceipType = 201610220104402204
				AND a.datadate <= '2023-12-01'
		WHERE c.enterpriseid = 1132506
		GROUP BY c.enterpriseid
	) qcdate
	ON a.enterpriseid = qcdate.enterpriseid
	LEFT JOIN (
		SELECT t1.EnterpriseID AS EnterpriseID, t1.DataDate AS DataDate, t3.ParentValue, Avg(t2.AmountTotal) AS yfAmount
		FROM test_qlw_business.ls_carriage t1
			INNER JOIN test_qlw_business.ls_carriagedetail t2 ON t1.NumericalOrder = t2.NumericalOrder
			INNER JOIN test_qlw_business.biz_related t3
			ON RelatedType = 201610210104402122
				AND ParentType = 1805081858270000130
				AND ChildType = 1612230944450000101
				AND t2.NumericalOrderDetail = t3.ChildValue
		WHERE t2.AmountTotal IS NOT NULL
			AND t1.EnterpriseID = 1132506
			AND t1.DataDate <= '2023-12-31'
		GROUP BY EnterpriseID, DataDate, ParentValue
	) t3
	ON t3.ParentValue = a.NumericalOrder
		AND t3.EnterpriseID = a.EnterpriseID
	LEFT JOIN qlw_test_com.biz_customer_czx cus ON a.CustomerID = cus.CustomerID
WHERE a.DataDate <= '2023-12-31'
	AND a.EnterpriseID = 1132506
GROUP BY a.CustomerID, SummaryType1


ERROR 3877 (HY000): Binder Error: column "CustomerName" must appear in the GROUP BY clause or must be part of an aggregate function.
Either add it to the GROUP BY list, or use "ANY_VALUE(CustomerName)" if the exact value of "CustomerName" is not important.



mysql> show variables like '%sql_mode%'
    -> ;
+-----------------+----------------------------------------------------------------------------------------------------+
| Variable_name   | Value                                                                                              |
+-----------------+----------------------------------------------------------------------------------------------------+
| shrink_sql_mode |                                                                                                    |
| sql_mode        | STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION |
+-----------------+----------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

修改完sql_mode后,只对上层优化器生效
rapid引擎还有一层优化器,没被sql_mode影响到
不过我也认为这算是个"问题",会反馈给研发同学

我还发现个问题,在greatsql里面,用Rapid查询和不用Rapid查询,得到的结果有一列数据结果不一样:
输入图片说明

输入图片说明

那个sql用的列sql是:

SUM(CASE 
		WHEN a.DataDate < '2023-12-01'
			AND a.DataDate >= qcdate.DataDate
		THEN t3.yfAmount
		ELSE 0
	END) AS Qmyf

最终结果集可能是一样的,只不过需要加ORDER BY + LIMIT之类的约束来控制,详见手册其中的一条:

  • 由于底层存储结构的差异,用户从InnoDB主引擎和Rapid辅助引擎分别读取数据时,如果不加相同的排序规则,则读取到的数据顺序可能不一致。

手册:https://greatsql.cn/docs/8032-25/user-manual/5-enhance/5-1-highperf-rapid-engine.html#_8-%E6%B3%A8%E6%84%8F%E4%BA%8B%E9%A1%B9

对了,你说的需要加ORDER BY + LIMIT之类的约束来控制,和不加相同的排序规则,则读取到的数据顺序可能不一致,这两句话,我没明白。

对于我这个sql,应该怎么改,才能保证Qmyf列的数据,和greatsql读取的数据是一致的。

例如原始SQL是:select * from t1 limit 3;
查询innodb引擎时,会根据聚集索引顺序读取前3条记录,和下面的效果一样:
select * from t1 order by pkid limit 3;

而rapid引擎则可能不是这个顺序,此时需要修改成:
select * from t1 order by pkid limit 3;

这样才能保证结果和原来一样。

如果原始SQL是:select * from t1; 则在不同引擎中读取到的结果集全部数据是一样的,只是排序顺序不一样而已。

好的,我试过tidb和clickhouse,一直没用起来。
tidb的问题是,有些sql语句在mysql是走索引的,速度还行,放到tidb就走全表扫描,根本查不出来,研究很长时间也没解决放弃了;
clickhouse的问题是,mysql向ck的数据同步问题,不知怎么做好;

目前看到了greatsql的rapid存储引擎,虽然目前还是有些数据类型的限制,但是我感觉挺值得期待的,后期会持续关注。

上面的sql语句,用greatesql查询需要6s,用rapid后变成600ms,确实不错,和clickhouse测得的结果一致。

希望能早日支持bit类型,要不我们暂时还没法用

更多问题请来我们社区论坛发帖交流吧 https://greatsql.cn/forum.php

这个其实是sql_mode中的NO_ZERO_IN_DATE,NO_ZERO_DATE引起的,我去掉了mysql层面的sql_mode限制,却不能影响rapid的sql_mode,这个建议也增强一下。
请问,我现在有什么其他办法可以临时解决一下吗

mysql> alter table biz_xxx                   SECONDARY_LOAD;
ERROR 3877 (HY000): An error occurred while process value, Conversion Error: Date out of range: 0-0-0
mysql> show variables like '%sql_mode%' ;
+-----------------+----------------------------------------------------------------------------------------------------+
| Variable_name   | Value                                                                                              |
+-----------------+----------------------------------------------------------------------------------------------------+
| shrink_sql_mode |                                                                                                    |
| sql_mode        | STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION |
+-----------------+----------------------------------------------------------------------------------------------------+
2 rows in set (0.01 sec)

mysql> set sql_mode='STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> alter table biz_xxx                    SECONDARY_LOAD;
ERROR 3877 (HY000): An error occurred while process value, Conversion Error: Date out of range: 0-0-0
mysql> 

目前rapid引擎还没办法从server层继承sql_mode设置,这个问题已经给研发同学提需求了

登录 后才可以发表评论

状态
负责人
项目
里程碑
Pull Requests
关联的 Pull Requests 被合并后可能会关闭此 issue
分支
开始日期   -   截止日期
-
置顶选项
优先级
预计工期 (小时)
参与者(4)
8779455 greatsql 1615290631 1204471 methinking 1578946100
C++
1
https://gitee.com/GreatSQL/GreatSQL.git
git@gitee.com:GreatSQL/GreatSQL.git
GreatSQL
GreatSQL
GreatSQL

搜索帮助