SQL databases use two entirely different group by algorithms. The first one, the hash algorithm, aggregates the input records in a temporary hash table. Once all input records are processed, the hash table is returned as the result. The second algorithm, the sort/group algorithm, first sorts the input data by the grouping key so that the rows of each group follow each other in immediate succession. Afterwards, the database just needs to aggregate them. In general, both algorithms need to materialize an intermediate state, so they are not executed in a pipelined manner. Nevertheless the sort/group algorithm can use an index to avoid the sort operation, thus enabling a pipelined group by.
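To make the difference between the two algorithms concrete, here is a minimal Python sketch of a `COUNT(*) ... GROUP BY key`. It is not how any particular database implements grouping. The function names `hash_group_count` and `sorted_group_count` and the sample rows are made up for illustration, and the `sorted()` call only stands in for reading rows in index order.

```python
from itertools import groupby
from operator import itemgetter


def hash_group_count(rows):
    """Hash algorithm: aggregate into a temporary hash table.

    Nothing can be returned until every input row has been seen.
    """
    counts = {}
    for key, _value in rows:
        counts[key] = counts.get(key, 0) + 1
    return counts


def sorted_group_count(rows_in_key_order):
    """Sort/group algorithm, assuming the input is already ordered by key.

    Each group is emitted as soon as its last row has been read,
    so the result is produced in a pipelined manner.
    """
    for key, group in groupby(rows_in_key_order, key=itemgetter(0)):
        yield key, sum(1 for _ in group)


# Hypothetical sample data: (grouping key, payload)
rows = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]

hash_result = hash_group_count(rows)

# Assumption: sorted() stands in for an index that delivers rows in key order.
index_order = sorted(rows, key=itemgetter(0))
pipelined_result = list(sorted_group_count(index_order))
```

Without the index, the sort/group variant would have to sort the whole input first, which is the intermediate state the quote mentions. With the rows already in key order, the first group can be returned before the rest of the input has been read.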
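What do you mean by "has a big impact on index performance"? That indexing takes too long? That doesn't seem to have much to do with group by, though. So I'm guessing your question is really "can an index improve the performance of group by?" The cause and effect in that version seems easier to follow. If that is what you're asking, the passage quoted here may give you some hints.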
Source: Indexing Group By