SQL Server中的哪些对象会占用磁盘空间? 看到标题的第一瞬间,让我想到的就是这个问题。下面我们就试着来讲一讲这个问题. 第一个磁盘空间使用大头肯定想到就是表。表只是一个逻辑对象,又没有想过表这个逻辑对象是怎么在磁盘上存储的呢? 《数据库系统实现原
SQL Server中的哪些对象会占用磁盘空间? 看到标题的第一瞬间,让我想到的就是这个问题。下面我们就试着来讲一讲这个问题.
第一个磁盘空间使用大头肯定想到就是表。表只是一个逻辑对象,又没有想过表这个逻辑对象是怎么在磁盘上存储的呢? 《数据库系统实现原理》或者叫做《Database System implementation》一书中对表的存储方式应该有更详尽的描述。我们只讨论SQL SERVER的实现,所以不扯那么远。
SQL SERVER的空间分配,大的层面上来说,有file group, data file, log file之分。File group是逻辑上对data file和log file做分类。假设我们要新建一个database, 叫做lenistest。这个database 我们要分别将data file和log file归类到不同的file group里面,方便管理与维护。主要区别的是 primary file group和secondary file group,也就是 .mdf和.ndf的区别。
<code class=" hljs sql"><span class="hljs-operator"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">DATABASE</span> [lenistest5] <span class="hljs-keyword">ON</span> <span class="hljs-keyword">PRIMARY</span> ( NAME = N<span class="hljs-string">'lenistest5'</span>, FILENAME = N<span class="hljs-string">'E:\Data_BU\lenistest5.mdf'</span> , <span class="hljs-keyword">SIZE</span> = <span class="hljs-number">10240</span>KB , MAXSIZE = <span class="hljs-number">102400</span>KB , FILEGROWTH = <span class="hljs-number">1024</span>KB ) , filegroup maindatagroup ( NAME = N<span class="hljs-string">'lenistest5_data01'</span>, FILENAME = N<span class="hljs-string">'E:\Data_BU\lenistest5_data01.ndf'</span> , <span class="hljs-keyword">SIZE</span> = <span class="hljs-number">10240</span>KB , MAXSIZE = <span class="hljs-number">102400</span>KB , FILEGROWTH = <span class="hljs-number">1024</span>KB ) , filegroup backupdatafg ( NAME = N<span class="hljs-string">'lenistest5_bk_data01'</span>, FILENAME = N<span class="hljs-string">'E:\Data_BU\lenistest5_bk_data01.ndf'</span> , <span class="hljs-keyword">SIZE</span> = <span class="hljs-number">10240</span>KB , MAXSIZE = <span class="hljs-number">10240</span>KB , FILEGROWTH = <span class="hljs-number">1024</span>KB ) LOG <span class="hljs-keyword">ON</span> ( NAME = N<span class="hljs-string">'lenistest5_log'</span>, FILENAME = N<span class="hljs-string">'E:\Data_BU\lenistest5_log.ldf'</span> , <span class="hljs-keyword">SIZE</span> = <span class="hljs-number">10240</span>KB , MAXSIZE = <span class="hljs-number">10240</span>KB , FILEGROWTH = <span class="hljs-number">1024</span>KB ) <span class="hljs-keyword">GO</span></span></code>
用上面的这个SQL我们可以创建一个具有3个data file group, 和1个log file group的数据库 lenistest5 。.mdf全局唯一 ,不能有多个.mdf文件,但是可以有多个.ndf文件。我们是不是可以看到.mdf到底存储了什么?
<code class=" hljs cs"><span class="hljs-keyword">select</span> name ,recovery_model_desc ,is_auto_create_stats_on ,is_auto_create_stats_incremental_on ,is_auto_update_stats_on ,is_auto_update_stats_async_on <span class="hljs-keyword">from</span> sys.databases <span class="hljs-keyword">where</span> name = <span class="hljs-string">'lenistest5'</span></code>
这里可以看到刚创建的数据库有怎么样的恢复计划,这直接影响了日志的存储,还有统计信息更新计划,同样也会影响存储,更会影响执行计划的优劣,所以这也是需要创建数据后核实的。
<code class=" hljs sql"><span class="hljs-operator"><span class="hljs-keyword">select</span> name <span class="hljs-keyword">as</span> FileGroupName ,data_space_id ,type_desc ,is_default <span class="hljs-keyword">from</span> sys.filegroups <span class="hljs-keyword">select</span> type_desc ,data_space_id ,name ,physical_name ,state_desc ,<span class="hljs-keyword">size</span> * <span class="hljs-number">8</span> /<span class="hljs-number">1024</span> <span class="hljs-keyword">as</span> size_mb ,max_size * <span class="hljs-number">8</span> /<span class="hljs-number">1024</span> <span class="hljs-keyword">as</span> max_size_mb <span class="hljs-keyword">from</span> sys.database_files</span></code>
sys.filegroups, sys.database_files是归属于特定数据库的,所以运行的时候需要切换到特定的数据库底下。不象有些DMV是全局性的,不需要指定数据库,在任何数据库根目录下,都能查到一致性的数据,比如 sys.dm_tran_locks.
Is_default这里需要特别指出来 ,使因为如果在create table之后没有指定特别的file group,默认这个表就是存在这个file group之下。如果要更改这个default file group,我们可以这么做:
<code class=" hljs sql"><span class="hljs-operator"><span class="hljs-keyword">alter</span> <span class="hljs-keyword">database</span> lenistest5 modify filegroup maindatagroup <span class="hljs-keyword">default</span></span></code>
Size, max_size是以PAGE为单位来计算的。一个page的存储大小为8KB ,所以计算起来就是乘以8 ,再除以1024换成MB。
<code class=" hljs sql"><span class="hljs-operator"><span class="hljs-keyword">select</span> isnull(g.FileGroupName,<span class="hljs-string">'LOG File Group'</span>) <span class="hljs-keyword">as</span> FileGroupName , isnull(g.type_desc,<span class="hljs-string">'LOG FILE GROUP'</span>) <span class="hljs-keyword">as</span> Filegroup_type_description , isnull(g.is_default,<span class="hljs-number">0</span>) <span class="hljs-keyword">as</span> DefaultFileGroup , f.type_desc <span class="hljs-keyword">as</span> datafile_type_description , f.name <span class="hljs-keyword">as</span> fileName , f.physical_name <span class="hljs-keyword">as</span> file_physical_name , f.state_desc <span class="hljs-keyword">as</span> datafilestatus , f.size_mb <span class="hljs-keyword">as</span> datafile_size_mb , f.max_size_mb <span class="hljs-keyword">as</span> datafile_max_size_mb <span class="hljs-keyword">from</span> ( <span class="hljs-keyword">select</span> name <span class="hljs-keyword">as</span> FileGroupName ,data_space_id ,type_desc ,is_default <span class="hljs-keyword">from</span> sys.filegroups ) g <span class="hljs-keyword">right</span> <span class="hljs-keyword">outer</span> <span class="hljs-keyword">join</span> ( <span class="hljs-keyword">select</span> type_desc ,data_space_id ,name ,physical_name ,state_desc ,<span class="hljs-keyword">size</span> * <span class="hljs-number">8</span> /<span class="hljs-number">1024</span> <span class="hljs-keyword">as</span> size_mb ,max_size * <span class="hljs-number">8</span> /<span class="hljs-number">1024</span> <span class="hljs-keyword">as</span> max_size_mb <span class="hljs-keyword">from</span> sys.database_files ) f <span class="hljs-keyword">on</span> g.data_space_id = f.data_space_id <span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> f.data_space_id <span class="hljs-keyword">asc</span></span></code>
将 Filegroup 包含的所有 data file归纳起来,包括日志文件 。日志文件没有filegroup.
我们看看当新建一个表的时候,表结构及数据的存储:
<code class=" hljs sql"><span class="hljs-operator"><span class="hljs-keyword">create</span> <span class="hljs-keyword">table</span> dbo.sales(transactionDate datetime, amont <span class="hljs-keyword">int</span>)</span></code>
看表数据存储需要借助 DBCC IND 和 DBCC PAGE. 默认情况下,我们执行这些 DBCC 命令, 输出文件不是我们的SSMS Console,所以需要将输出重定位,DBCC TraceOn(3604)可以帮我们把带输出的DBCC命令将结果输出到SSMS Console;DBCC TraceOn(3605)可以帮我们把带输出的DBCC命令将结果输出到SQL SERVER Error Log。这里我们选用DBCC TranceOn(3604). 命令的有效范围是当前session, 需要关掉的话用DBCC TraceOff(3604).
<code class=" hljs scss">DBCC <span class="hljs-function">TraceOn(<span class="hljs-number">3604</span>)</span> DBCC <span class="hljs-function">IND(lenistest5,<span class="hljs-string">'dbo.sales'</span>,<span class="hljs-number">0</span>)</span></code>
当表里没有数据的时候,DBCC IND 是没有数据的,所以只显示:
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.
DBCC IND 的语法是:
DBCC IND ( {dbname}, {table_name},{index_id} )
Index_id为0的时候,表示取的是堆表的信息,其他数值,等同于sys.indexes.index_id.
返回结果所包含的列有:
PageFID: page file Id. 数据页所在的数据文件的地址。也就是sys.database_files.file_id 的值。
PagePID: page id
IAMFID: index allocation MAP file id. 等同 sys.database_files.file_id.
IAMPID: Index allocation MAP page id
PageType : 注明了这个page的用途 :
1 - Data page
2 - Index page
3 - Large object page
4 - Large object page
8 - Global Allocation Map page
9 - Share Global Allocation Map page
10 - Index Allocation Map page
11 - Page Free Space page
13 - Boot page
15 - File header page
16 - Differential Changed Map page
17 - Bulk Changed Map page
其他字段比较容易理解。
既然知道了这一个页,比如IAMPID, 那我们就可以知道这个页到底存了哪些东西,还可以比较IAM page 与普通page的异同。 甚至还可以比较GAM, IAM, SGAM的不同,这放以后讨论。现在我们的表里暂时只有一条数据,所以总共才2个page. 一个IAM page,一个data page. 真好用来做比较。要想看一个page的存储内容,DBCC PAGE就该上场了。用法如下:
DBCC PAGE( {dbid|dbname}, pagenum [,print option] [,cache] [,logical] )
也有的是这么介绍的,毕竟这是非官方支持的命令,所以都试试
<code class=" hljs mathematica">dbcc page ( <span class="hljs-list">{‘dbname’ | dbid}</span>, filenum, pagenum [, printopt=<span class="hljs-list">{0|1|2|3}</span> ])</code>
The filenum and pagenum parameters are taken from the page IDs that come from various system tables and appear in DBCC or other system error messages. A page ID of, say, (1:354) has filenum = 1 and pagenum = 354.
The printopt parameter has the following meanings:
0 – print just the page header
1 – page header plus per-row hex dumps and a dump of the page slot array (unless its a page that doesn’t have one, like allocation bitmaps)
2 – page header plus whole page hex dump
3 – page header plus detailed per-row interpretation
Filenum: 对应了DBCC IND结果集里的 pageFID, 数据文件的 ID
PAGENum:对应了 DBDD IND 结果集里的 pagePID, 数据页的 ID
PrintOpt:
0: page头文件信息
1: page头文件信息,加上每一行的16进制信息
2: page头文件信息,加上每一页的16进制信息
3: page头文件信息,加上详细的每一页的每一行的解释信息
似乎这里第二种写法比较靠谱:
DBCC PAGE (lenistest5, 3,9,3)
PAGE: (3:9)
BUFFER:
BUF @0x0000000484E524C0
bpage = 0x00000003F348C000 bhash = 0x0000000000000000 bpageno = (3:9)
bdbid = 35 breferences = 0 bcputicks = 0
bsampleCount = 0 bUse1 = 15680 bstat = 0xb
blog = 0x1212121c bnext = 0x0000000000000000
PAGE HEADER:
Page @0x00000003F348C000
m_pageId = (3:9) m_headerVersion = 1 m_type = 10
m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x0
m_objId (AllocUnitId.idObj) = 120 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594045792256
Metadata: PartitionId = 72057594040549376 Metadata: IndexId = 0
Metadata: ObjectId = 245575913 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 90 m_slotCnt = 2 m_freeCnt = 6
m_freeData = 8182 m_reservedCnt = 0 m_lsn = (35:193:15)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0 DB Frag ID = 1
Allocation Status
GAM (3:2) = ALLOCATED SGAM (3:3) = ALLOCATED
PFS (3:1) = 0x70 IAM_PG MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (3:6) =
CHANGEDML (3:7) = NOT MIN_LOGGED
IAM: Header @0x0000000012DFA064 Slot 0, Offset 96
sequenceNumber = 0 status = 0x0 objectId = 0
indexId = 0 page_count = 0 start_pg = (3:0)
IAM: Single Page Allocations @0x0000000012DFA08E
Slot 0 = (3:8) Slot 1 = (0:0) Slot 2 = (0:0)
Slot 3 = (0:0) Slot 4 = (0:0) Slot 5 = (0:0)
Slot 6 = (0:0) Slot 7 = (0:0)
IAM: Extent Alloc Status Slot 1 @0x0000000012DFA0C2
(3:0) - (3:1272) = NOT ALLOCATED
DBCC execution completed. If DBCC printed error messages, contact your
system administrator.
有这么一行需要特别注意的:
IAM: Single Page Allocations @0x0000000012DFA08E
Slot 0 = (3:8)
这是说明IAM PAGE 这一页记录了他所能管辖的数据页的分配,slot 0 =(3:8). 8就代表了data page id =8 .
而下面这一行,代表的就是IAM PAGE所在的page id
Page @0x00000003F348C000
m_pageId = (3:9)
比较下data page 与 IAM Page 的不同:
DBCC PAGE (lenistest5, 3,8,3)
PAGE: (3:8)
BUFFER:
BUF @0x0000000484E53D80
bpage = 0x00000003F34AA000 bhash = 0x0000000000000000 bpageno = (3:8)
bdbid = 35 breferences = 0 bcputicks = 0
bsampleCount = 0 bUse1 = 16691 bstat = 0xb
blog = 0x212121cc bnext = 0x0000000000000000
PAGE HEADER:
Page @0x00000003F34AA000
m_pageId = (3:8) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 120 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594045792256
Metadata: PartitionId = 72057594040549376 Metadata: IndexId = 0
Metadata: ObjectId = 245575913 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 16 m_slotCnt = 1 m_freeCnt = 8075
m_freeData = 115 m_reservedCnt = 0 m_lsn = (35:193:28)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0 DB Frag ID = 1
Allocation Status
GAM (3:2) = ALLOCATED SGAM (3:3) = ALLOCATED
PFS (3:1) = 0x61 MIXED_EXT ALLOCATED 50_PCT_FULL DIFF (3:6) = CHANGED
ML (3:7) = NOT MIN_LOGGED
Slot 0 Offset 0x60 Length 19
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record
Size = 19Memory Dump @0x000000001AF5A060
0000000000000000: 10001000 bb7d7701 10a60000 01000000 020000
….?}w..|………Slot 0 Column 1 Offset 0x4 Length 8 Length (physical) 8
transactionDate = 2016-05-24 22:47:07.290
Slot 0 Column 2 Offset 0xc Length 4 Length (physical) 4
amont = 1
这页存储的数据一目了然,而且数据类型,字节大小都明白的告诉我们了:
Slot 0 Column 1 Offset 0x4 Length 8 Length (physical) 8
transactionDate = 2016-05-24 22:47:07.290
Slot 0 Column 2 Offset 0xc Length 4 Length (physical) 4
amont = 1
到这里我们已经可以用脚本来归纳所有file group, data file,以及table ,index的对应关系了:利用 DBCC IND来获取整个数据库 表和索引的文件对应关系。还有一种方法,使用新增加的DMC来查询,这个DMV是 sys.dm_db_database_page_allocations.分清楚表和索引的存储关系,不仅仅是方便管理,更有利于性能的提高,表和索引分别存储在不同的硬盘驱动器上,有利于并行处理。
<code class=" hljs sql">use lenistest4 go declare @tablename varchar(200) declare @index_Id int declare @sqlstatement nvarchar(max) declare @databasename varchar(200) ='lenistest4' declare cur_tables cursor for (<span class="hljs-operator"><span class="hljs-keyword">select</span> schema_name(schema_id) +<span class="hljs-string">'.'</span>+name <span class="hljs-keyword">as</span> tableName <span class="hljs-keyword">from</span> sys.tables ) <span class="hljs-keyword">open</span> cur_tables <span class="hljs-keyword">fetch</span> <span class="hljs-keyword">next</span> <span class="hljs-keyword">from</span> cur_tables <span class="hljs-keyword">into</span> @tablename <span class="hljs-keyword">if</span> <span class="hljs-keyword">exists</span>( <span class="hljs-keyword">select</span> <span class="hljs-number">1</span> <span class="hljs-keyword">from</span> tempdb.sys.tables <span class="hljs-keyword">where</span> upper(name) <span class="hljs-keyword">like</span> upper(<span class="hljs-string">'%tempTabIndall%'</span>) ) <span class="hljs-keyword">drop</span> <span class="hljs-keyword">table</span> #tempTabIndall ;</span> <span class="hljs-operator"><span class="hljs-keyword">create</span> <span class="hljs-keyword">table</span> #tempTabIndall(PageFID bigint, PagePID bigint, IAMFID bigint, IAMPID bigint, ObjectID bigint, IndexId bigint, PartitionNumber bigint, PartitionID bigint, iam_chain_type <span class="hljs-keyword">varchar</span>(<span class="hljs-number">500</span>) , PageType bigint, IndexLevel bigint, NextPageFID bigint, NextPagePID bigint,PrevPageFID bigint, PrevPagePID bigint) <span class="hljs-keyword">create</span> index idx_pagefid <span class="hljs-keyword">on</span> #tempTabIndall(PageFID) ;</span> while @@FETCH_STATUS = 0 <span class="hljs-operator"><span class="hljs-keyword">begin</span> <span class="hljs-keyword">declare</span> cur_indexes <span class="hljs-keyword">cursor</span> <span class="hljs-keyword">for</span> (<span class="hljs-keyword">select</span> index_id <span class="hljs-keyword">from</span> sys.indexes <span class="hljs-keyword">where</span> object_id = object_id(@tablename)) <span class="hljs-keyword">open</span> cur_indexes <span class="hljs-keyword">fetch</span> <span class="hljs-keyword">next</span> <span class="hljs-keyword">from</span> cur_indexes <span class="hljs-keyword">into</span> @index_Id while @@FETCH_STATUS = <span class="hljs-number">0</span> <span class="hljs-keyword">begin</span> <span class="hljs-keyword">set</span> @sqlstatement = N<span class="hljs-string">'insert into #tempTabIndall exec sp_executesql N''DBCC IND('</span> + @databasename + <span class="hljs-string">','''''</span>+@tablename+<span class="hljs-string">''''','</span> + convert(<span class="hljs-keyword">varchar</span>(<span class="hljs-aggregate">max</span>),@index_Id)+<span class="hljs-string">')'''</span> ;</span> print @sqlstatement exec sp_executesql @sqlstatement fetch next from cur_indexes into @index_Id <span class="hljs-operator"><span class="hljs-keyword">end</span> <span class="hljs-keyword">close</span> cur_indexes <span class="hljs-keyword">deallocate</span> cur_indexes <span class="hljs-keyword">fetch</span> <span class="hljs-keyword">next</span> <span class="hljs-keyword">from</span> cur_tables <span class="hljs-keyword">into</span> @tablename <span class="hljs-keyword">end</span> <span class="hljs-keyword">close</span> cur_tables <span class="hljs-keyword">deallocate</span> cur_tables <span class="hljs-keyword">select</span> <span class="hljs-keyword">distinct</span> object_name(t.ObjectID) <span class="hljs-keyword">as</span> tablename , t.IndexId , ti.name <span class="hljs-keyword">as</span> IndexName , f.FileGroupName , f.Filegroup_type_description , f.DefaultFileGroup , f.datafile_type_description , f.fileName , f.file_physical_name <span class="hljs-keyword">from</span> #tempTabIndall t <span class="hljs-keyword">inner</span> <span class="hljs-keyword">join</span> (<span class="hljs-keyword">select</span> <span class="hljs-keyword">distinct</span> object_id,index_id,name <span class="hljs-keyword">from</span> sys.indexes) ti <span class="hljs-keyword">on</span> t.ObjectID = ti.object_id <span class="hljs-keyword">and</span> t.IndexId = ti.index_id <span class="hljs-keyword">left</span> <span class="hljs-keyword">join</span> ( <span class="hljs-keyword">select</span> isnull(data_file_id,<span class="hljs-number">0</span> ) <span class="hljs-keyword">as</span> data_file_id , isnull(g.FileGroupName,<span class="hljs-string">'LOG File Group'</span>) <span class="hljs-keyword">as</span> FileGroupName , isnull(g.type_desc,<span class="hljs-string">'LOG FILE GROUP'</span>) <span class="hljs-keyword">as</span> Filegroup_type_description , isnull(g.is_default,<span class="hljs-number">0</span>) <span class="hljs-keyword">as</span> DefaultFileGroup , f.type_desc <span class="hljs-keyword">as</span> datafile_type_description , f.name <span class="hljs-keyword">as</span> fileName , f.physical_name <span class="hljs-keyword">as</span> file_physical_name , f.state_desc <span class="hljs-keyword">as</span> datafilestatus , f.size_mb <span class="hljs-keyword">as</span> datafile_size_mb , f.max_size_mb <span class="hljs-keyword">as</span> datafile_max_size_mb <span class="hljs-keyword">from</span> ( <span class="hljs-keyword">select</span> name <span class="hljs-keyword">as</span> FileGroupName ,data_space_id ,type_desc ,is_default <span class="hljs-keyword">from</span> sys.filegroups ) g <span class="hljs-keyword">right</span> <span class="hljs-keyword">outer</span> <span class="hljs-keyword">join</span> ( <span class="hljs-keyword">select</span> file_id <span class="hljs-keyword">as</span> data_file_id ,type_desc ,data_space_id ,name ,physical_name ,state_desc ,<span class="hljs-keyword">size</span> * <span class="hljs-number">8</span> /<span class="hljs-number">1024</span> <span class="hljs-keyword">as</span> size_mb ,max_size * <span class="hljs-number">8</span> /<span class="hljs-number">1024</span> <span class="hljs-keyword">as</span> max_size_mb <span class="hljs-keyword">from</span> sys.database_files ) f <span class="hljs-keyword">on</span> g.data_space_id = f.data_space_id )f <span class="hljs-keyword">on</span> f.data_file_id = t.PageFID <span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> f.file_physical_name <span class="hljs-keyword">asc</span> ,object_name(t.ObjectID) <span class="hljs-keyword">asc</span>, t.IndexId <span class="hljs-keyword">asc</span></span></code>