SSIS Performance - Parallelism

WBOY
Released: 2016-06-07 15:54:04

Parallelism has reached almost every field since multi-core processors came into play, and SSIS is no exception. SSIS lets us configure parallelism at two different levels of granularity:

Package Level

By setting the MaxConcurrentExecutables property of the package, we tell the SSIS engine how many executables can run simultaneously. The default value is -1, which means the number of logical processors plus 2.
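The default rule above is simple arithmetic, and a small sketch makes it concrete. This is not SSIS code: the function name and its parameters are hypothetical, and the only behavior taken from the text is that -1 resolves to the processor count plus 2. Rejecting zero and other negative values is an assumption of this sketch.

```python
import os


def effective_max_concurrent_executables(setting: int, logical_processors: int) -> int:
    """Resolve a MaxConcurrentExecutables setting to an actual limit.

    Hypothetical helper for illustration only; it mirrors the rule in the text.
    """
    if setting == -1:
        # Default: number of logical processors plus 2.
        return logical_processors + 2
    if setting < 1:
        # Assumption of this sketch: only -1 or a positive integer is accepted.
        raise ValueError("expected -1 or a positive integer")
    return setting


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    print("logical processors:", cpus)
    print("default limit:", effective_max_concurrent_executables(-1, cpus))
```

On a machine with 4 logical processors, for example, the default would allow 6 executables to run at once.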

Now let's do a very simple exercise. I create three Data Flow Tasks in the package and set the MaxConcurrentExecutables property to 2, which means only two executables are allowed to run simultaneously. Then I set a breakpoint on each of them.

Then let's run the package. You will find that only two tasks are running; the third one has to wait until one of them finishes.

Then let's set MaxConcurrentExecutables to 3 and execute the package again. This time all three tasks run simultaneously.
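The same experiment can be imitated outside SSIS. The sketch below is only an analogy: a Python thread pool stands in for the SSIS runtime, `max_concurrent` plays the role of MaxConcurrentExecutables, and a short sleep stands in for a Data Flow Task's work. All names are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(task_count: int, max_concurrent: int, work_seconds: float = 0.1) -> int:
    """Run dummy tasks with a concurrency cap and return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def dummy_task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(work_seconds)  # stands in for the work of one Data Flow Task
        with lock:
            running -= 1

    # The pool size is the cap, analogous to MaxConcurrentExecutables.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        for _ in range(task_count):
            pool.submit(dummy_task)
    return peak


if __name__ == "__main__":
    print("cap 2:", peak_concurrency(task_count=3, max_concurrent=2))
    print("cap 3:", peak_concurrency(task_count=3, max_concurrent=3))
```

With three tasks, a cap of 2 leaves one task waiting for a free slot, while a cap of 3 lets all of them start together, which matches what the breakpoints show in the package.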

Data Flow Level

Now we have three executables (Data Flow Tasks) in the package, and all of them run simultaneously once we set MaxConcurrentExecutables = 3. Next, let's look inside the Data Flow Task: its EngineThreads property indicates the number of threads the Data Flow Task can use during execution.

The definition is a little obscure at first glance, so let me briefly explain the background. In general, the Data Flow Task is the only place where SSIS itself does E-T-L (you could argue that an Execute SQL Task can do this too, but in that case the SQL Server engine does the ETL and SSIS just makes the call). In the simplest scenario, where the Data Flow just extracts data from a source and loads it into a destination, we need one buffer and two threads: a Source Thread, which extracts data from the source, and a Worker Thread, which handles transformation and the destination.
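The one-buffer, two-thread arrangement is essentially a producer/consumer pair. The following sketch illustrates that idea in Python; it is an analogy, not how the SSIS engine is implemented. The bounded queue is an assumed stand-in for the pipeline buffer (real SSIS buffers hold batches of rows), and every name here is hypothetical.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the source rows


def run_simple_data_flow(source_rows, transform):
    """Move rows from a source to a destination using one buffer and two threads."""
    buffer = queue.Queue(maxsize=100)  # stand-in for the single pipeline buffer
    destination = []

    def source_thread() -> None:
        # Extract: read rows from the source and place them in the buffer.
        for row in source_rows:
            buffer.put(row)
        buffer.put(_END)

    def worker_thread() -> None:
        # Transform and load: take rows from the buffer and write them out.
        while True:
            row = buffer.get()
            if row is _END:
                break
            destination.append(transform(row))

    threads = [
        threading.Thread(target=source_thread),
        threading.Thread(target=worker_thread),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return destination


if __name__ == "__main__":
    print(run_simple_data_flow([1, 2, 3], lambda r: r * 10))
```

Because a single worker consumes from a first-in, first-out queue, the rows arrive at the destination in source order.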

But that's only the simplest scenario. In most cases the Data Flow performs some transformations (such as Union All, Lookup, Derived Column, etc.) and therefore needs more threads. SSIS uses the concept of an Execution Tree for this: each Execution Tree means SSIS must create a buffer and needs a thread.

Now I create four Source -> Destination paths in every Data Flow Task, which means there are four execution trees per Data Flow Task, and SSIS needs four worker threads if we want all of them to run simultaneously.

If we set EngineThreads = 2, then only two of those Source -> Destination paths should run simultaneously. (When I tried this on SQL Server 2012, I found that all four ran simultaneously. I am still wondering why, and will update this once I find the answer.)
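The thread arithmetic of this example can be written down as a small sketch. It assumes the simplified model used in the text, one worker thread per execution tree with EngineThreads acting as a hard cap. The SQL Server 2012 observation above suggests the real scheduler does not behave exactly like this, so treat the numbers as the model's prediction only. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class DataFlowModel:
    name: str
    execution_trees: int  # one per Source -> Destination path in this example


def worker_threads_needed(flow: DataFlowModel) -> int:
    """Worker threads needed for every execution tree to run at the same time."""
    return flow.execution_trees


def rounds_under_cap(flow: DataFlowModel, engine_threads: int) -> int:
    """Rounds needed if at most engine_threads trees could run at once (simplified model)."""
    if engine_threads < 1:
        raise ValueError("engine_threads must be at least 1")
    return math.ceil(flow.execution_trees / engine_threads)


if __name__ == "__main__":
    flows = [DataFlowModel(f"Data Flow Task {i}", execution_trees=4) for i in (1, 2, 3)]
    total = sum(worker_threads_needed(f) for f in flows)
    print("worker threads for all three tasks at once:", total)
    print("rounds per task with a cap of 2:", rounds_under_cap(flows[0], 2))
```

Under this model, three Data Flow Tasks with four execution trees each would want twelve worker threads in total, and a cap of two threads would split each task's four paths into two rounds.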

Source: php.cn