Search Results

Search found 3236 results on 130 pages for 'ssis frameworks'.

Page 3/130 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • SSIS code smell – Unused columns in the dataflow

    - by jamiet
    A code smell is defined on Wikipedia as being a “symptom in the source code of a program that possibly indicates a deeper problem”. It’s a term commonly used by our code-writing brethren to describe sub-optimal code but I think the term can be applied equally well to SSIS packages too as I shall now explain One of my pet hates about SSIS development is packages that throw warnings of the form: The output column "ColumnName" (1358) on output "OLE DB Source Output" (1289) and component "OLE_SRC Name" (1279) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.  The warning is fairly self-explanatory – any column that appears in the data flow but doesn’t get used will throw this warning when the data flow is executed. Its not the negligible performance degradation that they cause that bothers me though, it’s the clutter that they cause in your log file/table. Take a look at the following screenshot if you don’t believe me: There are 231409 such warnings in the system that I took this screenshot from, that is 231409 log records that should not be there. The most infuriating thing about this warning is that it is so easily avoidable; eliminating such columns is a very quick and easy thing to do in the SSIS Designer. The only problem I see is that the warnings don’t occur until you execute the package – it would be preferable for the designer to have an unobtrusive way of informing you of them as well. Anyway, I digress… I consider such warnings to be a code smell because, to me, they’re symptomatic of a lack of due care and attention; a lack of developer discipline if you will. What other code smells can you think of when building SSIS packages? If I get a good list in the comments maybe I’ll compile them into a later blog post. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Learn SSIS in London 12-14 Sep 2012!

    - by andyleonard
    My friends at TechniTrain , the students, and I had a blast during the March 2012 London delivery of From Zero To SSIS ! We have decided to do it again in September 2012 with my new Learning SSIS 2012 3-day course ! Please find a course outline here . It is difficult to list everything I cover in the course, but the outline hits the high spots. This material grew out of my experiences serving as a consultant on short-term engagements and as a manager (and enterprise ETL architect) for a team of forty...(read more)

    Read the article

  • Learn SSIS in London 12-14 Sep 2012!

    - by andyleonard
    My friends at TechniTrain , the students, and I had a blast during the March 2012 London delivery of From Zero To SSIS ! We have decided to do it again in September 2012 with my new Learning SSIS 2012 3-day course ! Please find a course outline here . It is difficult to list everything I cover in the course, but the outline hits the high spots. This material grew out of my experiences serving as a consultant on short-term engagements and as a manager (and enterprise ETL architect) for a team of forty...(read more)

    Read the article

  • Developing a Custom SSIS Source Component

    SSIS was designed to be extensible. Although you can create tasks that will take data from a wide variety of sources, transform the data is a number of ways and write the results a wide choice of destinations, using the components provided, there will always be occasions when you need to customise your own SSIS component. Yes, it is time to hone up your C# skills and cut some code, as Saurabh explains.

    Read the article

  • SSIS Reporting Pack v0.4 – Execution Report updated

    - by jamiet
    SSIS Reporting Pack is a suite of reports that I maintain at http://ssisreportingpack.codeplex.com/ that provide visualisation over the SSIS Catalog in SQL Server 2012 and attempt to add value over the reports that ship in the box. Work on the reports has stalled (my last SSIS Reporting Pack blog post was on 4th September 2011) as I’ve had rather more important things going on my life of late however I have recently checked-in a fix that couldn’t really be delayed. I discovered a problem with the Execution report that was causing the report to effectively hang, it was caused by this bit of SQL hidden away in the report definition: [generated_executables] AS (   SELECT  [new_executable].[execution_path],[new_executable].[parent_execution_path]   FROM    (           SELECT  [execution_path] = SUBSTRING([loop_iteration].[execution_path] ,1, [loop_iteration].length_exec_path - [loop_iteration].[char_index_close_square] + 1)           ,       [parent_execution_path] = SUBSTRING([loop_iteration].[execution_path] ,1, [loop_iteration].length_exec_path - [loop_iteration].[char_index_open_square])           FROM    (                   SELECT  [execution_path]                   ,       [char_index_open_square] = CHARINDEX('[',REVERSE([execution_path]),1)                   ,       [char_index_close_square] = CHARINDEX(']',REVERSE([execution_path]),1)                   ,       [length_exec_path] = LEN([execution_path])                   FROM    [exec_stats] es                   WHERE   execution_path LIKE '%\[%]%'  ESCAPE '\'                   )AS [loop_iteration]           ) AS [new_executable]   GROUP   BY [new_executable].[execution_path],[new_executable].[parent_execution_path]) It was there because SSIS does not currently treat a loop iteration as an executable yet I figured there was still value in being able to view it as such – this SQL essentially “invents” new executables for those loop iterations; its what enabled the following visualisation: where each of the three iterations of a For Each Loop called “FEL Loop over top performing regions” appear in the report. Unfortunately, as I alluded, this could under certain circumstances (most likely when there were many loop iterations) cause the report to hang as it waited for the results to be constructed and returned. The change that I have made eradicates this generation of “fake” executables and thus produces this visualisation instead: Notice that the three “children” of the For Each Loop are no longer the three iterations but actually the task (“EPT Call Data Export Package”) contained within that For Each Loop. The problem here is of course that there is no longer a visual distinction between those three iterations; I have instead made the full execution path viewable via a tooltip:   If you preferred the “old” way of presenting this information and are happy to put up with the performance degradation then I have kept the old version of the report hanging around in the reporting pack as “execution loop with iterations” however none of the other reports link to it so you will have to browse to it manually if you want to use it. Please let me know if you ARE using it – I would be very interested to hear about your experiences.   The last change to make you aware of in the execution report is that by default I no longer show OnPreValidate or OnPostValidate messages as I consider them to be superfluous and only serve to clutter up the results. If you want to put them back, well, its open source so go right ahead!   The latest release of SSIS Reporting Pack that contains all of these changes is v0.4 and can be downloaded from http://ssisreportingpack.codeplex.com/releases/view/88178   Feedback on all of the above changes would be very much appreciated. @Jamiet

    Read the article

  • SSIS Reporting Pack v0.4 – Execution Report updated

    - by jamiet
    SSIS Reporting Pack is a suite of reports that I maintain at http://ssisreportingpack.codeplex.com/ that provide visualisation over the SSIS Catalog in SQL Server 2012 and attempt to add value over the reports that ship in the box. Work on the reports has stalled (my last SSIS Reporting Pack blog post was on 4th September 2011) as I’ve had rather more important things going on my life of late however I have recently checked-in a fix that couldn’t really be delayed. I discovered a problem with the Execution report that was causing the report to effectively hang, it was caused by this bit of SQL hidden away in the report definition: [generated_executables] AS (   SELECT  [new_executable].[execution_path],[new_executable].[parent_execution_path]   FROM    (           SELECT  [execution_path] = SUBSTRING([loop_iteration].[execution_path] ,1, [loop_iteration].length_exec_path - [loop_iteration].[char_index_close_square] + 1)           ,       [parent_execution_path] = SUBSTRING([loop_iteration].[execution_path] ,1, [loop_iteration].length_exec_path - [loop_iteration].[char_index_open_square])           FROM    (                   SELECT  [execution_path]                   ,       [char_index_open_square] = CHARINDEX('[',REVERSE([execution_path]),1)                   ,       [char_index_close_square] = CHARINDEX(']',REVERSE([execution_path]),1)                   ,       [length_exec_path] = LEN([execution_path])                   FROM    [exec_stats] es                   WHERE   execution_path LIKE '%\[%]%'  ESCAPE '\'                   )AS [loop_iteration]           ) AS [new_executable]   GROUP   BY [new_executable].[execution_path],[new_executable].[parent_execution_path]) It was there because SSIS does not currently treat a loop iteration as an executable yet I figured there was still value in being able to view it as such – this SQL essentially “invents” new executables for those loop iterations; its what enabled the following visualisation: where each of the three iterations of a For Each Loop called “FEL Loop over top performing regions” appear in the report. Unfortunately, as I alluded, this could under certain circumstances (most likely when there were many loop iterations) cause the report to hang as it waited for the results to be constructed and returned. The change that I have made eradicates this generation of “fake” executables and thus produces this visualisation instead: Notice that the three “children” of the For Each Loop are no longer the three iterations but actually the task (“EPT Call Data Export Package”) contained within that For Each Loop. The problem here is of course that there is no longer a visual distinction between those three iterations; I have instead made the full execution path viewable via a tooltip:   If you preferred the “old” way of presenting this information and are happy to put up with the performance degradation then I have kept the old version of the report hanging around in the reporting pack as “execution loop with iterations” however none of the other reports link to it so you will have to browse to it manually if you want to use it. Please let me know if you ARE using it – I would be very interested to hear about your experiences.   The last change to make you aware of in the execution report is that by default I no longer show OnPreValidate or OnPostValidate messages as I consider them to be superfluous and only serve to clutter up the results. If you want to put them back, well, its open source so go right ahead!   The latest release of SSIS Reporting Pack that contains all of these changes is v0.4 and can be downloaded from http://ssisreportingpack.codeplex.com/releases/view/88178   Feedback on all of the above changes would be very much appreciated. @Jamiet

    Read the article

  • Working with Decimal fields in SSIS

    - by CoffeeAddict
    I'm using SQL Server 2008 w/SP2. I've got an incoming decimal(9,2) field incoming through my OLE DB transformation to my recordset destination transformation. It's like it's reading it as something other than a decimal? I don't know..I'm not an SSIS guru. So continuing on...the problem I have starts here with me trying to stuff the value into a variable for this decimal field. In a foreach loop, I have a variable to represent this decimal field so I can work with it. The first problem that I believe is pretty well known is SSIS variables do not have a decimal type. And from my own testing and what I've read out there, people are using type object for the variable to make SSIS "happy" with decimal values? It makes mine happy. But, then in my foreach loop, I have a for loop. And inside that I'm using an E*xecute SQL Task transformation*. In it, I need to create a parameter mapping to my variable so I can work with that decimal field in my T-SQL call in here. So now I see a type decimal for the parameter and use it and set that to point to my variable. When I run SSIS and it hits my SQL call, I get this in my output window.: The type is not supported.DBTYPE_DECIMAL So I am hitting a wall here. All I wanna do is work with a decimal!!!

    Read the article

  • ODBC in SSIS 2012

    - by jamiet
    In August 2011 the SQL Server client team published a blog post entitled Microsoft is Aligning with ODBC for Native Relational Data Access in which they basically said "OLE DB is the past, ODBC is the future. Deal with it.". From that blog post:We encourage you to adopt ODBC in the development of your new and future versions of your application. You don’t need to change your existing applications using OLE DB, as they will continue to be supported on Denali throughout its lifecycle. While this gives you a large window of opportunity for changing your applications before the deprecation goes into effect, you may want to consider migrating those applications to ODBC as a part of your future roadmap.I recently undertook a project using SSIS2012 and heeded that advice by opting to use ODBC Connection Managers rather than OLE DB Connection Managers. Unfortunately my finding was that the ODBC Connection Manager is not yet ready for primetime use in SSIS 2012. The main issue I found was that you can't populate an Object variable with a recordset when using an Execute SQL Task connecting to an ODBC data source; any attempt to do so will result in an error:"Disconnected recordsets are not available from ODBC connections." I have filed a bug on Connect at ODBC Connection Manager does not have same funcitonality as OLE DB. For this reason I strongly recommend that you don't make the move to ODBC Connection Managers in SSIS just yet - best to wait for the next version of SSIS before doing that.I found another couple of issues with the ODBC Connection Manager that are worth keeping in mind:It doesn't recognise System Data Source Names (DSNs), only User DSNs (bug filed at ODBC System DSNs are not available in the ODBC Connection Manager)  UPDATE: According to a comment on that Connect item this may only be a problem on 64bit.In the OLE DB Connection Manager parameter ordinals are 0-based, in the ODBC Connection Manager they are 1-based (oh I just can't wait for the upgrade mess that ensues from this one!!!)You have been warned!@jamiet

    Read the article

  • SSIS Dashboard v0.4

    - by Davide Mauri
    Following the post on SSISDB script on Gist, I’ve been working on a HTML5 SSIS Dashboard, in order to have a nice looking, user friendly and, most of all, useful, SSIS Dashboard. Since this is a “spare-time” project, I’ve decided to develop it using Python since it’s THE data language (R aside), it’s a beautiful & powerful, well established and well documented and with a rich ecosystem around. Plus it has full support in Visual Studio, through the amazing Python Tools For Visual Studio plugin, I decided also to use Flask, a very good micro-framework to create websites, and use the SB Admin 2.0 Bootstrap admin template, since I’m all but a Web Designer. The result is here: https://github.com/yorek/ssis-dashboard and I can say I’m pretty satisfied with the work done so far (I’ve worked on it for probably less than 24 hours). Though there’s some features I’d like to add in t future (historical execution time, some charts, connection with AzureML to do prediction on expected execution times) it’s already usable. Of course I’ve tested it only on my development machine, so check twice before putting it in production but, give the fact that, virtually, there is not installation needed (you only need to install Python), and that all queries are based on standard SSISDB objects, I expect no big problems. If you want to test, contribute and/or give feedback please fell free to do it…I would really love to see this little project become a community project! Enjoy!

    Read the article

  • The one feature that would make me invest in SSIS 2012

    - by Peter Larsson
    This week I was invited my Microsoft to give two presentations in Slovenia. My presentations went well and I had good energy and the audience was interacting with me. When I had some time over from networking and partying, I attended a few other presentations. At least the ones who where held in English. One of these was "SQL Server Integration Services 2012 - All the News, and More", given by Davide Mauri, a fellow co-worker from SolidQ. We started to talk and soon came into the details of the new things in SSIS 2012. All of the official things Davide talked about are good stuff, but for me, the best thing is one he didn't cover in his presentation. In earlier versions of SSIS than 2012, it is possible to have a stored procedure to act as a data source, as long as it doesn't have a temp table in it. In that case, you will get an error message from SSIS that "Metadata could not be found". This is still true with SSIS 2012, so the thing I am talking about is not really a SSIS feature, it's a SQL Server 2012 feature. And this is the EXECUTE WITH RESULTSETS feature! With this, you can have a stored procedure with a temp table to deliver the resultset to SSIS, if you execute the stored procedure from SSIS and add the "WITH RESULTSETS" option. If you do this, SSIS is able to take the metadata from the code you write in SSIS and not from the stored procedure! And it's very fast too. Let's say you have a stored procedure in earlier versions and when referencing that stored procedure in SSIS forced SSIS to call the stored procedure (which can take hours), to retrieve the metadata. Now, with RESULTSETS, SSIS 2012 can continue in milliseconds! This is because you provide the metadata in the RESULTSETS clause, and if the data from the stored procedure doesn't match this RESULTSETS, you will get an error anyway, so it makes sense Microsoft has provided this optimization for us.

    Read the article

  • ODBC in SSIS 2012

    - by jamiet
    In August 2011 the SQL Server client team published a blog post entitled Microsoft is Aligning with ODBC for Native Relational Data Access in which they basically said "OLE DB is the past, ODBC is the future. Deal with it.". From that blog post:We encourage you to adopt ODBC in the development of your new and future versions of your application. You don’t need to change your existing applications using OLE DB, as they will continue to be supported on Denali throughout its lifecycle. While this gives you a large window of opportunity for changing your applications before the deprecation goes into effect, you may want to consider migrating those applications to ODBC as a part of your future roadmap.I recently undertook a project using SSIS2012 and heeded that advice by opting to use ODBC Connection Managers rather than OLE DB Connection Managers. Unfortunately my finding was that the ODBC Connection Manager is not yet ready for primetime use in SSIS 2012. The main issue I found was that you can't populate an Object variable with a recordset when using an Execute SQL Task connecting to an ODBC data source; any attempt to do so will result in an error:"Disconnected recordsets are not available from ODBC connections." I have filed a bug on Connect at ODBC Connection Manager does not have same funcitonality as OLE DB. For this reason I strongly recommend that you don't make the move to ODBC Connection Managers in SSIS just yet - best to wait for the next version of SSIS before doing that.I found another couple of issues with the ODBC Connection Manager that are worth keeping in mind:It doesn't recognise System Data Source Names (DSNs), only User DSNs (bug filed at ODBC System DSNs are not available in the ODBC Connection Manager)  UPDATE: According to a comment on that Connect item this may only be a problem on 64bit.In the OLE DB Connection Manager parameter ordinals are 0-based, in the ODBC Connection Manager they are 1-based (oh I just can't wait for the upgrade mess that ensues from this one!!!)You have been warned!@jamiet

    Read the article

  • Is the abundance of Frameworks dumbing down programmers?

    - by Gratzy
    With all of the frameworks available these days ORM's DI/IoC etc. I find that many programmers are losing or don't have the problem solving skills needed to solve difficult issues. I've seen many times unexpected behaviour creep into applications and the developers unable to really dig in and find the issues. It seems to me that deep understanding of whats going on under the hood is being lost. Don't get me wrong, I'm not suggesting these frameworks aren't good and haven't moved the industry forward, only asking if as a unintended consequence developers aren't gaining the knowledge and skill needed for deep understanding of systems.

    Read the article

  • Merge Join component sorted outputs [SSIS]

    - by jamiet
    One question that I have been asked a few times of late in regard to performance tuning SSIS data flows is this: Why isn’t the Merge Join output sorted (i.e.IsSorted=True)? This is a fair question. After all both of the Merge Join inputs are sorted, hence why wouldn’t the output be sorted as well? Well here’s a little secret, the Merge Join output IS sorted! There’s a caveat though – it is only under certain circumstances and SSIS itself doesn’t do a good job of informing you of it. Let’s take a look at an example. Here we have a dataflow that consumes data from the [AdventureWorks2008].[Sales].[SalesOrderHeader] & [AdventureWorks2008].[Sales].[SalesOrderDetail] tables then joins them using a Merge Join component: Let’s take a look inside the editor of the Merge Join: We are joining on the [SalesOrderId] field (which is what the two inputs just happen to be sorted upon). We are also putting [SalesOrderHeader].[SalesOrderId] into the output. Believe it or not the output from this Merge Join component is sorted (i.e. has IsSorted=True) but unfortunately the Merge Join component does not have an Advanced Editor hence it is hidden away from us. There are a couple of ways to prove to you that is the case; I could open up the package XML inside the .dtsx file and show you the metadata but there is an easier way than that – I can attach a Sort component to the output. Take a look: Notice that the Sort component is attempting to sort on the [SalesOrderId] column. This gives us the following warning: Validation warning. DFT Get raw data: {992B7C9A-35AD-47B9-A0B0-637F7DDF93EB}: The data is already sorted as specified so the transform can be removed. The warning proves that the output from the Merge Join is sorted! It must be noted that the Merge Join output will only have IsSorted=True if at least one of the join columns is included in the output. So there you go, the Merge Join component can indeed produce a sorted output and that’s very useful in order to avoid unnecessary expensive Sort operations downstream. Hope this is useful to someone out there! @Jamiet  P.S. Thank you to Bob Bojanic on the SSIS product team who pointed this out to me!

    Read the article

  • REPLACENULL in SSIS 2012

    - by Davide Mauri
    While preparing my slides e demos for the forthcoming SQL Server Conference 2012 in Italy, I’ve come across a nice addition to DTS Expression language which I never noticed before and that seems unknown also to the blogosphere: REPLACENULL. REPLACENULL is the same of ISNULL in T-SQL. It’s *very* useful especially when loading a fact table of your BI solution when you need to replace unexisting reference to dimension with dummy values. Here’s an example of how it can be used (please notice that in this example I’m NOT loading a fact table): I’ve noticed that the feature was requested by fellow MVP John Welch http://connect.microsoft.com/SQLServer/feedback/details/636057/ssis-add-a-replacenull-function-to-the-expression-language So: Thanks John and Thanks SSIS Team ! Ah, btw, the Help online is here http://msdn.microsoft.com/en-us/library/hh479601(v=sql.110).aspx Enjoy!

    Read the article

  • SSIS: Building SQL databases on-the-fly using concatenated SQL scripts

    - by DrJohn
    Over the years I have developed many techniques which help automate the whole SQL Server build process. In my current process, where I need to build entire OLAP data marts on-the-fly, I make regular use of a simple but very effective mechanism to concatenate all the SQL Scripts together from my SSMS (SQL Server Management Studio) projects. This proves invaluable because in two clicks I can redeploy an entire SQL Server database with all tables, views, stored procedures etc. Indeed, I can also use the concatenated SQL scripts with SSIS to build SQL Server databases on-the-fly. You may be surprised to learn that I often redeploy the database several times per day, or even several times per hour, during the development process. This is because the deployment errors are logged and you can quickly see where SQL Scripts have object dependency errors. For example, after changing a table structure you may have forgotten to change any related views. The deployment log immediately points out all the objects which failed to build so you can fix and redeploy the database very quickly. The alternative approach (i.e. doing changes in the database directly using the SSMS UI) would require you to check all dependent objects before making changes. The chances are that you will miss something and wonder why your app returns the wrong data – a common problem caused by changing a table without re-creating dependent views. Using SQL Projects in SSMS A great many developers fail to make use of SQL Projects in SSMS (SQL Server Management Studio). To me they are invaluable way of organizing your SQL Scripts. The screenshot below shows a typical SSMS solution made up of several projects – one project for tables, another for views etc. The key point is that the projects naturally fall into the right order in file system because of the project name. The number in the folder or file name ensures that the projects the SQL scripts are concatenated together in the order that they need to be executed. Hence the script filenames start with 100, 110 etc. Concatenating SQL Scripts To concatenate the SQL Scripts together into one file, I use notepad.exe to create a simple batch file (see example screenshot) which uses the TYPE command to write the content of the SQL Script files into a combined file. As the SQL Scripts are in several folders, I simply use several TYPE command multiple times and append the output together. If you are unfamiliar with batch files, you may not know that the angled bracket (>) means write output of the program into a file. Two angled brackets (>>) means append output of this program into a file. So the command-line DIR > filelist.txt would write the content of the DIR command into a file called filelist.txt. In the example shown above, the concatenated file is called SB_DDS.sql If, like me you place the concatenated file under source code control, then the source code control system will change the file's attribute to "read-only" which in turn would cause the TYPE command to fail. The ATTRIB command can be used to remove the read-only flag. Using SQLCmd to execute the concatenated file Now that the SQL Scripts are all in one big file, we can execute the script against a database using SQLCmd using another batch file as shown below: SQLCmd has numerous options, but the script shown above simply executes the SS_DDS.sql file against the SB_DDS_DB database on the local machine and logs the errors to a file called SB_DDS.log. So after executing the batch file you can simply check the error log to see if your database built without a hitch. If you have errors, then simply fix the source files, re-create the concatenated file and re-run the SQLCmd to rebuild the database. This two click operation allows you to quickly identify and fix errors in your entire database definition.Using SSIS to execute the concatenated file To execute the concatenated SQL script using SSIS, you simply drop an Execute SQL task into your package and set the database connection as normal and then select File Connection as the SQLSourceType (as shown below). Create a file connection to your concatenated SQL script and you are ready to go.   Tips and TricksAdd a new-line at end of every fileThe most common problem encountered with this approach is that the GO statement on the last line of one file is placed on the same line as the comment at the top of the next file by the TYPE command. The easy fix to this is to ensure all your files have a new-line at the end.Remove all USE database statementsThe SQLCmd identifies which database the script should be run against.  So you should remove all USE database commands from your scripts - otherwise you may get unintentional side effects!!Do the Create Database separatelyIf you are using SSIS to create the database as well as create the objects and populate the database, then invoke the CREATE DATABASE command against the master database using a separate package before calling the package that executes the concatenated SQL script.    

    Read the article

  • Merge Join component sorted outputs [SSIS]

    - by jamiet
    One question that I have been asked a few times of late in regard to performance tuning SSIS data flows is this: Why isn’t the Merge Join output sorted (i.e.IsSorted=True)? This is a fair question. After all both of the Merge Join inputs are sorted, hence why wouldn’t the output be sorted as well? Well here’s a little secret, the Merge Join output IS sorted! There’s a caveat though – it is only under certain circumstances and SSIS itself doesn’t do a good job of informing you of it. Let’s take a look at an example. Here we have a dataflow that consumes data from the [AdventureWorks2008].[Sales].[SalesOrderHeader] & [AdventureWorks2008].[Sales].[SalesOrderDetail] tables then joins them using a Merge Join component: Let’s take a look inside the editor of the Merge Join: We are joining on the [SalesOrderId] field (which is what the two inputs just happen to be sorted upon). We are also putting [SalesOrderHeader].[SalesOrderId] into the output. Believe it or not the output from this Merge Join component is sorted (i.e. has IsSorted=True) but unfortunately the Merge Join component does not have an Advanced Editor hence it is hidden away from us. There are a couple of ways to prove to you that is the case; I could open up the package XML inside the .dtsx file and show you the metadata but there is an easier way than that – I can attach a Sort component to the output. Take a look: Notice that the Sort component is attempting to sort on the [SalesOrderId] column. This gives us the following warning: Validation warning. DFT Get raw data: {992B7C9A-35AD-47B9-A0B0-637F7DDF93EB}: The data is already sorted as specified so the transform can be removed. The warning proves that the output from the Merge Join is sorted! It must be noted that the Merge Join output will only have IsSorted=True if at least one of the join columns is included in the output. So there you go, the Merge Join component can indeed produce a sorted output and that’s very useful in order to avoid unnecessary expensive Sort operations downstream. Hope this is useful to someone out there! @Jamiet  P.S. Thank you to Bob Bojanic on the SSIS product team who pointed this out to me!

    Read the article

  • Export all SSIS packages from msdb using Powershell

    - by jamiet
    Have you ever wanted to dump all the SSIS packages stored in msdb out to files? Of course you have, who wouldn’t? Right? Well, at least one person does because this was the subject of a thread (save all ssis packages to file) on the SSIS forum earlier today. Some of you may have already figured out a way of doing this but for those that haven’t here is a nifty little script that will do it for you and it uses our favourite jack-of-all tools … Powershell!!   Imagine I have the following package folder structure on my Integration Services server (i.e. in [msdb]): There are two packages in there called “20110111 Chaining Expression components” & “Package”, I want to export those two packages into a folder structure that mirrors that in [msdb]. Here is the Powershell script that will do that:   Param($SQLInstance = "localhost") #####Add all the SQL goodies (including Invoke-Sqlcmd)##### add-pssnapin sqlserverprovidersnapin100 -ErrorAction SilentlyContinue add-pssnapin sqlservercmdletsnapin100 -ErrorAction SilentlyContinue cls $Packages = Invoke-Sqlcmd -MaxCharLength 10000000 -ServerInstance $SQLInstance -Query "WITH cte AS ( SELECT cast(foldername as varchar(max)) as folderpath, folderid FROM msdb..sysssispackagefolders WHERE parentfolderid = '00000000-0000-0000-0000-000000000000' UNION ALL SELECT cast(c.folderpath + '\' + f.foldername as varchar(max)), f.folderid FROM msdb..sysssispackagefolders f INNER JOIN cte c ON c.folderid = f.parentfolderid ) SELECT c.folderpath,p.name,CAST(CAST(packagedata AS VARBINARY(MAX)) AS VARCHAR(MAX)) as pkg FROM cte c INNER JOIN msdb..sysssispackages p ON c.folderid = p.folderid WHERE c.folderpath NOT LIKE 'Data Collector%'" Foreach ($pkg in $Packages) { $pkgName = $Pkg.name $folderPath = $Pkg.folderpath $fullfolderPath = "c:\temp\$folderPath\" if(!(test-path -path $fullfolderPath)) { mkdir $fullfolderPath | Out-Null } $pkg.pkg | Out-File -Force -encoding ascii -FilePath "$fullfolderPath\$pkgName.dtsx" }   To run it simply change the “localhost” parameter of the server you want to connect to either by editing the script or passing it in when the script is executed. It will create the folder structure in C:\Temp (which you can also easily change if you so wish – just edit the script accordingly). Here’s the folder structure that it created for me: Notice how it is a mirror of the folder structure in [msdb]. Hope this is useful! @Jamiet

    Read the article

  • SSIS Denali CTP1 Source Assistant

    - by andyleonard
    I like the new Data Flow Source Assistant in SSIS Denali. The default view is shown above, with the "Show installed only" checkbox checked. When not checked, the list of Source types changes: In previous versions of SSIS, I rarely created connections in the Connection Managers pane - I usually hit a New button in either a Source or Destination Adapter, or in a task. It was just easier letting the task or adapter pick the proper Connection Manager editor. This is handy and a time-saver. :{>...(read more)

    Read the article

  • Learning frameworks without learning languages

    - by Tom Morris
    I've been reading up on GUI frameworks including WPF, GTK and Cocoa (UIKit). I don't really do anything related to Windows (I'm a Mac and Linux guy) or .NET, but I'd like to be able to throw together GUIs for various operating systems. We are in the enviable position now of having high level scripting languages that work with all of the major GUI toolkits. If you are doing Linux GUI programming, you could use GTK in C, but why not just use PyGTK (or PyQt). Similarly, for Java, one can use JRuby. For Mac, there's MacRuby. And on .NET, there's IronRuby. This is all fine and good, and if you are building a serious project, there are tradeoffs that you might encounter when deciding whether to, say, build a WPF app in C# or in IronRuby, or whether you are going to use PyGTK or not. The subjective question I have is: what about learning those frameworks? Are there strong reasons why one should or should not learn something like WPF or Cocoa in a language one is familiar with rather than having to learn a new language as well? I'm not saying you should never learn the language. If you are building Windows applications and you don't know C#, that might be a bit of a problem. But do you think it is okay to learn the framework first? This is both a general question and a specific question. I've used some Cocoa classes from Ruby and Python using things like PyObjC and there always seems to be an impedance mismatch because of the way Objective C libraries get built. Experiences and strong opinions welcome!

    Read the article

  • PHP frameworks - should I use them?

    - by 30secondstosam
    First of all let me explain who I am: I am a PHP Developer working for a company developing their CMS which handles online stores, data feeds and other content like blogs. I have been programming for 6 years and 4 of those have been in PHP. (I used to be a C# developer). My question is regarding PHP frameworks. I see so many jobs asking for zend, cakephp and other MVC types of frameworks. However, even as an experienced developer I have never used a PHP framework. Should I start learning? If so where do I even start as there are so many. I am about to re-write so much of my work's CMS and I'm wondering whether I could help myself a lot by using a framework. What do people think? Considering I will be re-writing a lot of the online store stuff... I am experienced in OO and I have used the .NET framework in the past. Thanks in advance for the replies.

    Read the article

  • The SSIS tuning tip that everyone misses

    - by Rob Farley
    I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

    Read the article

  • SSIS Script Component Testing Strategy

    - by Paul Kohler
    This question is in respect to the script component specifically. I am aware of ssisUnit etc… With simple SSIS Scripts Components, it’s sufficient to let basic testing flesh out issues, however I am working with a script that has grown in complexity over time. To better test the functionality I am considering abstracting the script logic into a DLL that gets deployed with the package, and then use the custom component in the script. The advantage is that the function will be more testable etc but it’s one more deployment artefact that needs to be managed. My question is, does anyone know of a better way to test such an SSIS script in a more isolated manner than to run the whole package and examine the output?

    Read the article

  • FileNameColumnName property, Flat File Source Adapter : SSIS Nugget

    - by jamiet
    I saw a question on MSDN’s SSIS forum the other day that went something like this: I’m loading data into a table from a flat file but I want to be able to store the name of that file as well. Is there a way of doing that? I don’t want to come across as disrespecting those who took the time to reply but there was a few answers along the lines of “loop over the files using a For Each, store the file name in a variable yadda yadda yadda” when in fact there is a much much simpler way of accomplishing this; it just happens to be a little hidden away as I shall now explain! The Flat File Source Adapter has a property called FileNameColumnName which for some reason it isn’t exposed through the Flat File Source editor, it is however exposed via the Advanced Properties: You’ll see in the screenshot above that I have set FileNameColumnName=“Filename” (it doesn’t matter what name you use, anything except a non-zero string will work). What this will do is create a new column in our dataflow called “Filename” that contains, unsurprisingly, the name of the file from which the row was sourced. All very simple. This is particularly useful if you are extracting data from multiple files using the MultiFlatFile Connection Manager as it allows you to differentiate between data from each of the files as you can see in the following screenshot: So there you have it, the FileNameColumnName property; a little known secret of SSIS. I hope it proves to be useful to someone out there. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • FileNameColumnName property, Flat File Source Adapter : SSIS Nugget

    - by jamiet
    I saw a question on MSDN’s SSIS forum the other day that went something like this: I’m loading data into a table from a flat file but I want to be able to store the name of that file as well. Is there a way of doing that? I don’t want to come across as disrespecting those who took the time to reply but there was a few answers along the lines of “loop over the files using a For Each, store the file name in a variable yadda yadda yadda” when in fact there is a much much simpler way of accomplishing this; it just happens to be a little hidden away as I shall now explain! The Flat File Source Adapter has a property called FileNameColumnName which for some reason it isn’t exposed through the Flat File Source editor, it is however exposed via the Advanced Properties: You’ll see in the screenshot above that I have set FileNameColumnName=“Filename” (it doesn’t matter what name you use, anything except a non-zero string will work). What this will do is create a new column in our dataflow called “Filename” that contains, unsurprisingly, the name of the file from which the row was sourced. All very simple. This is particularly useful if you are extracting data from multiple files using the MultiFlatFile Connection Manager as it allows you to differentiate between data from each of the files as you can see in the following screenshot: So there you have it, the FileNameColumnName property; a little known secret of SSIS. I hope it proves to be useful to someone out there. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • SSIS – Delete all files except for the most recent one

    - by jorg
    Quite often one or more sources for a data warehouse consist of flat files. Most of the times these files are delivered as a zip file with a date in the file name, for example FinanceDataExport_20100528.zip Currently I work at a project that does a full load into the data warehouse every night. A zip file with some flat files in it is dropped in a directory on a daily basis. Sometimes there are multiple zip files in the directory, this can happen because the ETL failed or somebody puts a new zip file in the directory manually. Because the ETL isn’t incremental only the most recent file needs to be loaded. To implement this I used the simple code below; it checks which file is the most recent and deletes all other files. Note: In a previous blog post I wrote about unzipping zip files within SSIS, you might also find this useful: SSIS – Unpack a ZIP file with the Script Task Public Sub Main() 'Use this piece of code to loop through a set of files in a directory 'and delete all files except for the most recent one based on a date in the filename. 'File name example: 'DataExport_20100413.zip Dim rootDirectory As New DirectoryInfo(Dts.Variables("DirectoryFromSsisVariable").Value.ToString) Dim mostRecentFile As String = "" Dim currentFileDate As Integer Dim mostRecentFileDate As Integer = 0 'Check which file is the most recent For Each fi As FileInfo In rootDirectory.GetFiles("*.zip") currentFileDate = CInt(Left(Right(fi.Name, 12), 8)) 'Get date from current filename (based on a file that ends with: YYYYMMDD.zip) If currentFileDate > mostRecentFileDate Then mostRecentFileDate = currentFileDate mostRecentFile = fi.Name End If Next 'Delete all files except the most recent one For Each fi As FileInfo In rootDirectory.GetFiles("*.zip") If fi.Name <> mostRecentFile Then File.Delete(rootDirectory.ToString + "\" + fi.Name) End If Next Dts.TaskResult = ScriptResults.Success End Sub Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >