Search Results

Search found 1934 results on 78 pages for 'ssis catalog'.

Page 2/78 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • MicroTraining: Executing SSIS 2012 Packages 22 May 10:00 AM EDT (Free!)

    - by andyleonard
    I am pleased to announce the latest (free!) Linchpin People microtraining event will be held Tuesday 22 May 2012 at 10:00 AM EDT. The topic will be Executing SSIS 2012 Packages. In this presentation, I will be demonstrating several ways to execute SSIS 2012 packages. Register here ! Interested in learning about more microtraining from Linchpin People – before anyone else? Sign up for our newsletter ! :{>...(read more)

    Read the article

  • T-SQL Tuesday: Aggregations in SSIS

    - by andyleonard
    Introduction Jes Borland ( Blog | @grrl_geek ) is hosting this month's T-SQL Tuesday - started by SQLBlog's own Adam Machanic ( Blog | @AdamMachanic ) - and it is about aggregation. I thought I'd show a couple ways to do aggregation using SSIS. The Aggregate Transformation in SSIS The Aggregate transform in SSIS is fast . I built an SSIS package (AggregateScripts.dtsx) with two Data Flow Tasks (Using the Aggregate Transform and Using a Script Component). Using the Aggregate Transform looks like this:...(read more)

    Read the article

  • SQL Server Configuration timeouts - and a workaround [SSIS]

    - by jamiet
    Ever since I started writing SSIS packages back in 2004 I have opted to store configurations in .dtsConfig (.i.e. XML) files rather than in a SQL Server table (aka SQL Server Configurations) however recently I inherited some packages that used SQL Server Configurations and thus had to immerse myself in their murky little world. To all the people that have ever gone onto the SSIS forum and asked questions about ambiguous behaviour of SQL Server Configurations I now say this... I feel your pain! The biggest problem I have had was in dealing with the change to the order in which configurations get applied that came about in SSIS 2008. Those changes are detailed on MSDN at SSIS Package Configurations however the pertinent bits are: As the utility loads and runs the package, events occur in the following order: The dtexec utility loads the package. The utility applies the configurations that were specified in the package at design time and in the order that is specified in the package. (The one exception to this is the Parent Package Variables configurations. The utility applies these configurations only once and later in the process.) The utility then applies any options that you specified on the command line. The utility then reloads the configurations that were specified in the package at design time and in the order specified in the package. (Again, the exception to this rule is the Parent Package Variables configurations). The utility uses any command-line options that were specified to reload the configurations. Therefore, different values might be reloaded from a different location. The utility applies the Parent Package Variable configurations. The utility runs the package. To understand how these steps differ from SSIS 2005 I recommend reading Doug Laudenschlager’s blog post Understand how SSIS package configurations are applied. The very nature of SQL Server Configurations means that the Connection String for the database holding the configuration values needs to be supplied from the command-line. Typically then the call to execute your package resembles this: dtexec /FILE Package.dtsx /SET "\Package.Connections[SSISConfigurations].Properties[ConnectionString]";"\"Data Source=SomeServer;Initial Catalog=SomeDB;Integrated Security=SSPI;\"", The problem then is that, as per the steps above, the package will (1) attempt to apply all configurations using the Connection String stored in the package for the "SSISConfigurations" Connection Manager before then (2) applying the Connection String from the command-line and then (3) apply the same configurations all over again. In the packages that I inherited that first attempt to apply the configurations would timeout (not unexpected); I had 8 SQL Server Configurations in the package and thus the package was waiting for 2 minutes until all the Configurations timed out (i.e. 15seconds per Configuration) - in a package that only executes for ~8seconds when it gets to do its actual work a delay of 2minutes was simply unacceptable. We had three options in how to deal with this: Get rid of the use of SQL Server configurations and use .dtsConfig files instead Edit the packages when they get deployed Change the timeout on the "SSISConfigurations" Connection Manager #1 was my preferred choice but, for reasons I explain below*, wasn't an option in this particular instance. #2 was discounted out of hand because it negates the point of using Configurations in the first place. This left us with #3 - change the timeout on the Connection Manager. This is done by going into the properties of the Connection Manager, opening the "All" tab and changing the Connect Timeout property to some suitable value (in the screenshot below I chose 2 seconds). This change meant that the attempts to apply the SQL Server configurations timed out in 16 seconds rather than two minutes; clearly this isn't an optimum solution but its certainly better than it was. So there you have it - if you are having problems with SQL Server configuration timeouts within SSIS try changing the timeout of the Connection Manager. Better still - don't bother using SQL Server Configuration in the first place. Even better - install RC0 of SQL Server 2012 to start leveraging SSIS parameters and leave the nasty old world of configurations behind you. @Jamiet * Basically, we are leveraging a SSIS execution/logging framework in which the client had invested a lot of resources and SQL Server Configurations are an integral part of that.

    Read the article

  • SSIS: Deploying OLAP cubes using C# script tasks and AMO

    - by DrJohn
    As part of the continuing series on Building dynamic OLAP data marts on-the-fly, this blog entry will focus on how to automate the deployment of OLAP cubes using SQL Server Integration Services (SSIS) and Analysis Services Management Objects (AMO). OLAP cube deployment is usually done using the Analysis Services Deployment Wizard. However, this option was dismissed for a variety of reasons. Firstly, invoking external processes from SSIS is fraught with problems as (a) it is not always possible to ensure SSIS waits for the external program to terminate; (b) we cannot log the outcome properly and (c) it is not always possible to control the server's configuration to ensure the executable works correctly. Another reason for rejecting the Deployment Wizard is that it requires the 'answers' to be written into four XML files. These XML files record the three things we need to change: the name of the server, the name of the OLAP database and the connection string to the data mart. Although it would be reasonably straight forward to change the content of the XML files programmatically, this adds another set of complication and level of obscurity to the overall process. When I first investigated the possibility of using C# to deploy a cube, I was surprised to find that there are no other blog entries about the topic. I can only assume everyone else is happy with the Deployment Wizard! SSIS "forgets" assembly references If you build your script task from scratch, you will have to remember how to overcome one of the major annoyances of working with SSIS script tasks: the forgetful nature of SSIS when it comes to assembly references. Basically, you can go through the process of adding an assembly reference using the Add Reference dialog, but when you close the script window, SSIS "forgets" the assembly reference so the script will not compile. After repeating the operation several times, you will find that SSIS only remembers the assembly reference when you specifically press the Save All icon in the script window. This problem is not unique to the AMO assembly and has certainly been a "feature" since SQL Server 2005, so I am not amazed it is still present in SQL Server 2008 R2! Sample Package So let's take a look at the sample SSIS package I have provided which can be downloaded from here: DeployOlapCubeExample.zip  Below is a screenshot after a successful run. Connection Managers The package has three connection managers: AsDatabaseDefinitionFile is a file connection manager pointing to the .asdatabase file you wish to deploy. Note that this can be found in the bin directory of you OLAP database project once you have clicked the "Build" button in Visual Studio TargetOlapServerCS is an Analysis Services connection manager which identifies both the deployment server and the target database name. SourceDataMart is an OLEDB connection manager pointing to the data mart which is to act as the source of data for your cube. This will be used to replace the connection string found in your .asdatabase file Once you have configured the connection managers, the sample should run and deploy your OLAP database in a few seconds. Of course, in a production environment, these connection managers would be associated with package configurations or set at runtime. When you run the sample, you should see that the script logs its activity to the output screen (see screenshot above). If you configure logging for the package, then these messages will also appear in your SSIS logging. Sample Code Walkthrough Next let's walk through the code. The first step is to parse the connection string provided by the TargetOlapServerCS connection manager and obtain the name of both the target OLAP server and also the name of the OLAP database. Note that the target database does not have to exist to be referenced in an AS connection manager, so I am using this as a convenient way to define both properties. We now connect to the server and check for the existence of the OLAP database. If it exists, we drop the database so we can re-deploy. svr.Connect(olapServerName); if (svr.Connected) { // Drop the OLAP database if it already exists Database db = svr.Databases.FindByName(olapDatabaseName); if (db != null) { db.Drop(); } // rest of script } Next we start building the XMLA command that will actually perform the deployment. Basically this is a small chuck of XML which we need to wrap around the large .asdatabase file generated by the Visual Studio build process. // Start generating the main part of the XMLA command XmlDocument xmlaCommand = new XmlDocument(); xmlaCommand.LoadXml(string.Format("<Batch Transaction='false' xmlns='http://schemas.microsoft.com/analysisservices/2003/engine'><Alter AllowCreate='true' ObjectExpansion='ExpandFull'><Object><DatabaseID>{0}</DatabaseID></Object><ObjectDefinition/></Alter></Batch>", olapDatabaseName));  Next we need to merge two XML files which we can do by simply using setting the InnerXml property of the ObjectDefinition node as follows: // load OLAP Database definition from .asdatabase file identified by connection manager XmlDocument olapCubeDef = new XmlDocument(); olapCubeDef.Load(Dts.Connections["AsDatabaseDefinitionFile"].ConnectionString); // merge the two XML files by obtain a reference to the ObjectDefinition node oaRootNode.InnerXml = olapCubeDef.InnerXml;   One hurdle I had to overcome was removing detritus from the .asdabase file left by the Visual Studio build. Through an iterative process, I found I needed to remove several nodes as they caused the deployment to fail. The XMLA error message read "Cannot set read-only node: CreatedTimestamp" or similar. In comparing the XMLA generated with by the Deployment Wizard with that generated by my code, these read-only nodes were missing, so clearly I just needed to strip them out. This was easily achieved using XPath to find the relevant XML nodes, of which I show one example below: foreach (XmlNode node in rootNode.SelectNodes("//ns1:CreatedTimestamp", nsManager)) { node.ParentNode.RemoveChild(node); } Now we need to change the database name in both the ID and Name nodes using code such as: XmlNode databaseID = xmlaCommand.SelectSingleNode("//ns1:Database/ns1:ID", nsManager); if (databaseID != null) databaseID.InnerText = olapDatabaseName; Finally we need to change the connection string to point at the relevant data mart. Again this is easily achieved using XPath to search for the relevant nodes and then replace the content of the node with the new name or connection string. XmlNode connectionStringNode = xmlaCommand.SelectSingleNode("//ns1:DataSources/ns1:DataSource/ns1:ConnectionString", nsManager); if (connectionStringNode != null) { connectionStringNode.InnerText = Dts.Connections["SourceDataMart"].ConnectionString; } Finally we need to perform the deployment using the Execute XMLA command and check the returned XmlaResultCollection for errors before setting the Dts.TaskResult. XmlaResultCollection oResults = svr.Execute(xmlaCommand.InnerXml);  // check for errors during deployment foreach (Microsoft.AnalysisServices.XmlaResult oResult in oResults) { foreach (Microsoft.AnalysisServices.XmlaMessage oMessage in oResult.Messages) { if ((oMessage.GetType().Name == "XmlaError")) { FireError(oMessage.Description); HadError = true; } } } If you are not familiar with XML programming, all this may all seem a bit daunting, but perceiver as the sample code is pretty short. If you would like the script to process the OLAP database, simply uncomment the lines in the vicinity of Process method. Of course, you can extend the script to perform your own custom processing and to even synchronize the database to a front-end server. Personally, I like to keep the deployment and processing separate as the code can become overly complex for support staff.If you want to know more, come see my session at the forthcoming SQLBits conference.

    Read the article

  • SQL SERVER – SSIS Look Up Component – Cache Mode – Notes from the Field #028

    - by Pinal Dave
    [Notes from Pinal]: Lots of people think that SSIS is all about arranging various operations together in one logical flow. Well, the understanding is absolutely correct, but the implementation of the same is not as easy as it seems. Similarly most of the people think lookup component is just component which does look up for additional information and does not pay much attention to it. Due to the same reason they do not pay attention to the same and eventually get very bad performance. Linchpin People are database coaches and wellness experts for a data driven world. In this 28th episode of the Notes from the Fields series database expert Tim Mitchell (partner at Linchpin People) shares very interesting conversation related to how to write a good lookup component with Cache Mode. In SQL Server Integration Services, the lookup component is one of the most frequently used tools for data validation and completion.  The lookup component is provided as a means to virtually join one set of data to another to validate and/or retrieve missing values.  Properly configured, it is reliable and reasonably fast. Among the many settings available on the lookup component, one of the most critical is the cache mode.  This selection will determine whether and how the distinct lookup values are cached during package execution.  It is critical to know how cache modes affect the result of the lookup and the performance of the package, as choosing the wrong setting can lead to poorly performing packages, and in some cases, incorrect results. Full Cache The full cache mode setting is the default cache mode selection in the SSIS lookup transformation.  Like the name implies, full cache mode will cause the lookup transformation to retrieve and store in SSIS cache the entire set of data from the specified lookup location.  As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS. The most commonly used cache mode is the full cache setting, and for good reason.  The full cache setting has the most practical applications, and should be considered the go-to cache setting when dealing with an untested set of data. With a moderately sized set of reference data, a lookup transformation using full cache mode usually performs well.  Full cache mode does not require multiple round trips to the database, since the entire reference result set is cached prior to data flow execution. There are a few potential gotchas to be aware of when using full cache mode.  First, you can see some performance issues – memory pressure in particular – when using full cache mode against large sets of reference data.  If the table you use for the lookup is very large (either deep or wide, or perhaps both), there’s going to be a performance cost associated with retrieving and caching all of that data.  Also, keep in mind that when doing a lookup on character data, full cache mode will always do a case-sensitive (and in some cases, space-sensitive) string comparison even if your database is set to a case-insensitive collation.  This is because the in-memory lookup uses a .NET string comparison (which is case- and space-sensitive) as opposed to a database string comparison (which may be case sensitive, depending on collation).  There’s a relatively easy workaround in which you can use the UPPER() or LOWER() function in the pipeline data and the reference data to ensure that case differences do not impact the success of your lookup operation.  Again, neither of these present a reason to avoid full cache mode, but should be used to determine whether full cache mode should be used in a given situation. Full cache mode is ideally useful when one or all of the following conditions exist: The size of the reference data set is small to moderately sized The size of the pipeline data set (the data you are comparing to the lookup table) is large, is unknown at design time, or is unpredictable Each distinct key value(s) in the pipeline data set is expected to be found multiple times in that set of data Partial Cache When using the partial cache setting, lookup values will still be cached, but only as each distinct value is encountered in the data flow.  Initially, each distinct value will be retrieved individually from the specified source, and then cached.  To be clear, this is a row-by-row lookup for each distinct key value(s). This is a less frequently used cache setting because it addresses a narrower set of scenarios.  Because each distinct key value(s) combination requires a relational round trip to the lookup source, performance can be an issue, especially with a large pipeline data set to be compared to the lookup data set.  If you have, for example, a million records from your pipeline data source, you have the potential for doing a million lookup queries against your lookup data source (depending on the number of distinct values in the key column(s)).  Therefore, one has to be keenly aware of the expected row count and value distribution of the pipeline data to safely use partial cache mode. Using partial cache mode is ideally suited for the conditions below: The size of the data in the pipeline (more specifically, the number of distinct key column) is relatively small The size of the lookup data is too large to effectively store in cache The lookup source is well indexed to allow for fast retrieval of row-by-row values No Cache As you might guess, selecting no cache mode will not add any values to the lookup cache in SSIS.  As a result, every single row in the pipeline data set will require a query against the lookup source.  Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused.  In the real world, I don’t see a lot of use of the no cache setting, but I can imagine some edge cases where it might be useful. As such, it’s critical to know your data before choosing this option.  Obviously, performance will be an issue with anything other than small sets of data, as the no cache setting requires row-by-row processing of all of the data in the pipeline. I would recommend considering the no cache mode only when all of the below conditions are true: The reference data set is too large to reasonably be loaded into SSIS memory The pipeline data set is small and is not expected to grow There are expected to be very few or no duplicates of the key values(s) in the pipeline data set (i.e., there would be no benefit from caching these values) Conclusion The cache mode, an often-overlooked setting on the SSIS lookup component, represents an important design decision in your SSIS data flow.  Choosing the right lookup cache mode directly impacts the fidelity of your results and the performance of package execution.  Know how this selection impacts your ETL loads, and you’ll end up with more reliable, faster packages. If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSIS

    Read the article

  • Content Catalog Live!

    - by marius.ciortea
    tweetmeme_url = 'http://blogs.oracle.com/javaone/2010/06/content_catalog_live.html'; Share .FBConnectButton_Small{background-position:-5px -232px !important;border-left:1px solid #1A356E;} .FBConnectButton_Text{margin-left:12px !important ;padding:2px 3px 3px !important;} The Oracle OpenWorld, JavaOne and Oracle Develop 2010 content catalog is live. You can peruse most of the almost 2,000 sessions available this year at OpenWorld, JavaOne and Oracle Develop, including session titles, abstracts, track info, and confirmed speakers.You can find the latest on JDK 7, deep dives on the JVM, REST, JavaFX, JSF, Enterprise Java, Seam, OSGi, HTTP, Swing, GWT, Groovy, JRuby, Unit Testing, Metro, Lift, Comet, jclouds, Hudson, Scala, [insert technology here], etc. To access the Content Catalog, just look under Tools on the right side of this page. You can tag content in the catalog so you--and others who do what you do, or think the way you think--can easily find this year's don't-miss sessions. Take a few minutes to look around, and start planning your most productive/informative/valuable JavaOne ever! Schedule Builder, where you can sign up for sessions, will be up in July.

    Read the article

  • Execute a SSIS package in Sync or Async mode from SQL Server 2012

    - by Davide Mauri
    Today I had to schedule a package stored in the shiny new SSIS Catalog store that can be enabled with SQL Server 2012. (http://msdn.microsoft.com/en-us/library/hh479588(v=SQL.110).aspx) Once your packages are stored here, they will be executed using the new stored procedures created for this purpose. This is the script that will get executed if you try to execute your packages right from management studio or through a SQL Server Agent job, will be similar to the following: Declare @execution_id bigint EXEC [SSISDB].[catalog].[create_execution] @package_name='my_package.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'BI', @project_name=N'DWH', @use32bitruntime=False, @reference_id=Null Select @execution_id DECLARE @var0 smallint = 1 EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=@var0 DECLARE @var1 bit = 0 EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'DUMP_ON_ERROR', @parameter_value=@var1 EXEC [SSISDB].[catalog].[start_execution] @execution_id GO The problem here is that the procedure will simply start the execution of the package and will return as soon as the package as been started…thus giving you the opportunity to execute packages asynchrously from your T-SQL code. This is just *great*, but what happens if I what to execute a package and WAIT for it to finish (and thus having a synchronous execution of it)? You have to be sure that you add the “SYNCHRONIZED” parameter to the package execution. Before the start_execution procedure: exec [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'SYNCHRONIZED', @parameter_value=1 And that’s it . PS From the RC0, the SYNCHRONIZED parameter is automatically added each time you schedule a package execution through the SQL Server Agent. If you’re using an external scheduler, just keep this post in mind .

    Read the article

  • New SSIS tool on Codeplex – SSIS Log Analyzer

    I stumbled across a new SSIS tool on Codeplex today, the SSIS Log Analyzer which was only released a few days ago. Whilst it is a beta release and currently only supports 2005 (2008 is promised) it looks quite interesting. It seems to be a fancy log viewer, but with some clever features and a nice looking front-end. I’ve only read the documentation so far, but it has graphs and a debug view that shows your package with the colour animations similar to when debugging in BIDS, and everyone knows, the way the pretty colours and numbers change is the best bit! I’ll quote some of the features for you here and then let you make your own mind up, is it useful in the real world? Option to analyze the logs manually by applying row and column filters over the log data or by using queries to specify more complex criterions. Automated Performance Analysis which provides a quick graphical look on which tasks spent most time during package execution. Rerun (debug) the entire sequence of events which happened during package execution showing the flow of control in graphical form, changes in runtime values for each task like execution duration etc. Support for Auto Analyzers to automatically find out issues and provide suggestions for problems which can be figured out with the help of SSIS logs and/or package. Option to analyze just log file or log and package together. Provides a lightweight environment to have a quick look at the package. Opening it in BIDS takes some time as being an authoring environment it does all sorts of validations resulting in some delay. See http://ssisloganalyzer.codeplex.com/  for more details.

    Read the article

  • New SSIS tool on Codeplex – SSIS Log Analyzer

    I stumbled across a new SSIS tool on Codeplex today, the SSIS Log Analyzer which was only released a few days ago. Whilst it is a beta release and currently only supports 2005 (2008 is promised) it looks quite interesting. It seems to be a fancy log viewer, but with some clever features and a nice looking front-end. I’ve only read the documentation so far, but it has graphs and a debug view that shows your package with the colour animations similar to when debugging in BIDS, and everyone knows, the way the pretty colours and numbers change is the best bit! I’ll quote some of the features for you here and then let you make your own mind up, is it useful in the real world? Option to analyze the logs manually by applying row and column filters over the log data or by using queries to specify more complex criterions. Automated Performance Analysis which provides a quick graphical look on which tasks spent most time during package execution. Rerun (debug) the entire sequence of events which happened during package execution showing the flow of control in graphical form, changes in runtime values for each task like execution duration etc. Support for Auto Analyzers to automatically find out issues and provide suggestions for problems which can be figured out with the help of SSIS logs and/or package. Option to analyze just log file or log and package together. Provides a lightweight environment to have a quick look at the package. Opening it in BIDS takes some time as being an authoring environment it does all sorts of validations resulting in some delay. See http://ssisloganalyzer.codeplex.com/  for more details.

    Read the article

  • adding custom SSIS transformation to visual studio toolbox fails

    - by ryangaraygay
    Just very recently I encountered an issue in deploying a custom SSIS component assembly which turns out to be a relative "no-brainer" error if only the clues were more straightforward. Basically after deploying the assembly I could not find my component listed in the "SSIS Data Flow Items" tab list.It turns out that the assembly containing the component just had missing or referenced the incorrect assemblies.I have outlined the steps I took that guided me on the right direction on this blog post of mine : adding custom SSIS transformation to visual studio toolbox fails 

    Read the article

  • SSIS Virtual Class

    - by ejohnson2010
    I recorded a Virtual SSIS Class with the good folks over at SSWUG and the first airing of the class will by May 15th. This is 100% online so you can do it on your own time and from anywhere. The class will run monthly and I will be available for questions through out. You get the following 12 sessions on SSIS, each about an hour. Session 1: The SSIS Basics Session 2: Control Flow Basics Session 3: Data Flow - Sources and Destinations Session 4: Data Flow - Transformations Session 5: Advanced Transformations...(read more)

    Read the article

  • Content Catalog for Oracle OpenWorld is Ready

    - by Rick Ramsey
    American Major League Baseball Umpire Jim Joyce made one of the worst calls in baseball history when he ruled Jason Donald safe at First in Wednesday's game between the Detroit Lions and the Cleveland Indians. The New York Times tells the story well. It was the 9th inning. There were two outs. And Detroit Tiger's pitcher Armando Galarraga had pitched a perfect game. Instead of becoming the 21st pitcher in Major League Baseball history to pitch a perfect game, Galarraga became the 10th pitcher in Major League Baseball history to ever lose a perfect game with two outs in the ninth inning. More insight from the New York Times here. You can avoid a similar mistake and its attendant death treats, hate mail, and self-loathing by studying the Content Catalog just released for Oracle Open World, Java One, and Oracle Develop conferences being held in San Francisco September 19-23. The Content Catalog displays all the available content related to the event, the venue, and the stream or track you're interested in. Additional filters are available to narrow down your results even more. It's simple to use and a big help. Give it a try. It'll spare you the fate of Jim Joyce. - Rick

    Read the article

  • SSIS Lookup component tuning tips

    - by jamiet
    Yesterday evening I attended a London meeting of the UK SQL Server User Group at Microsoft’s offices in London Victoria. As usual it was both a fun and informative evening and in particular there seemed to be a few questions arising about tuning the SSIS Lookup component; I rattled off some comments and figured it would be prudent to drop some of them into a dedicated blog post, hence the one you are reading right now. Scene setting A popular pattern in SSIS is to use a Lookup component to determine whether a record in the pipeline already exists in the intended destination table or not and I cover this pattern in my 2006 blog post Checking if a row exists and if it does, has it changed? (note to self: must rewrite that blog post for SSIS2008). Fundamentally the SSIS lookup component (when using FullCache option) sucks some data out of a database and holds it in memory so that it can be compared to data in the pipeline. One of the big benefits of using SSIS dataflows is that they process data one buffer at a time; that means that not all of the data from your source exists in the dataflow at the same time and is why a SSIS dataflow can process data volumes that far exceed the available memory. However, that only applies to data in the pipeline; for reasons that are hopefully obvious ALL of the data in the lookup set must exist in the memory cache for the duration of the dataflow’s execution which means that any memory used by the lookup cache will not be available to be used as a pipeline buffer. Moreover, there’s an obvious correlation between the amount of data in the lookup cache and the time it takes to charge that cache; the more data you have then the longer it will take to charge and the longer you have to wait until the dataflow actually starts to do anything. For these reasons your goal is simple: ensure that the lookup cache contains as little data as possible. General tips Here is a simple tick list you can follow in order to tune your lookups: Use a SQL statement to charge your cache, don’t just pick a table from the dropdown list made available to you. (Read why in SELECT *... or select from a dropdown in an OLE DB Source component?) Only pick the columns that you need, ignore everything else Make the database columns that your cache is populated from as narrow as possible. If a column is defined as VARCHAR(20) then SSIS will allocate 20 bytes for every value in that column – that is a big waste if the actual values are significantly less than 20 characters in length. Do you need DT_WSTR typed columns or will DT_STR suffice? DT_WSTR uses twice the amount of space to hold values that can be stored using a DT_STR so if you can use DT_STR, consider doing so. Same principle goes for the numerical datatypes DT_I2/DT_I4/DT_I8. Only populate the cache with data that you KNOW you will need. In other words, think about your WHERE clause! Thinking outside the box It is tempting to build a large monolithic dataflow that does many things, one of which is a Lookup. Often though you can make better use of your available resources by, well, mixing things up a little and here are a few ideas to get your creative juices flowing: There is no rule that says everything has to happen in a single dataflow. If you have some particularly resource intensive lookups then consider putting that lookup into a dataflow all of its own and using raw files to pass the pipeline data in and out of that dataflow. Know your data. If you think, for example, that the majority of your incoming rows will match with only a small subset of your lookup data then consider chaining multiple lookup components together; the first would use a FullCache containing that data subset and the remaining data that doesn’t find a match could be passed to a second lookup that perhaps uses a NoCache lookup thus negating the need to pull all of that least-used lookup data into memory. Do you need to process all of your incoming data all at once? If you can process different partitions of your data separately then you can partition your lookup cache as well. For example, if you are using a lookup to convert a location into a [LocationId] then why not process your data one region at a time? This will mean your lookup cache only has to contain data for the location that you are currently processing and with the ability of the Lookup in SSIS2008 and beyond to charge the cache using a dynamically built SQL statement you’ll be able to achieve it using the same dataflow and simply loop over it using a ForEach loop. Taking the previous data partitioning idea further … a dataflow can contain more than one data path so why not split your data using a conditional split component and, again, charge your lookup caches with only the data that they need for that partition. Lookups have two uses: to (1) find a matching row from the lookup set and (2) put attributes from that matching row into the pipeline. Ask yourself, do you need to do these two things at the same time? After all once you have the key column(s) from your lookup set then you can use that key to get the rest of attributes further downstream, perhaps even in another dataflow. Are you using the same lookup data set multiple times? If so, consider the file caching option in SSIS 2008 and beyond. Above all, experiment and be creative with different combinations. You may be surprised at what works. Final  thoughts If you want to know more about how the Lookup component differs in SSIS2008 from SSIS2005 then I have a dedicated blog post about that at Lookup component gets a makeover. I am on a mini-crusade at the moment to get a BULK MERGE feature into the database engine, the thinking being that if the database engine can quickly merge massive amounts of data in a similar manner to how it can insert massive amounts using BULK INSERT then that’s a lot of work that wouldn’t have to be done in the SSIS pipeline. If you think that is a good idea then go and vote for BULK MERGE on Connect. If you have any other tips to share then please stick them in the comments. Hope this helps! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • SSIS - Access Denied with UNC paths - The file name is a device or contains invalid characters

    - by simonsabin
    I spent another day tearing my hair out yesterday trying to resolve an issue with SSIS packages runnning in SQLAgent (not got much left at the moment, maybe I should contact the SSIS team for a wig). My situation was that I am deploying packages to a development server, and to provide isolation I was running jobs with a proxy account that only had access to the development servers. Proxies are an awesome feature and mean that you should never have to "just run the job as sysadmin". The issue I was facing was that the job step was failing. The job step was a simple execution of the package.The following errors appeared in my log file. I always check the "Log step output in history" for a job step, this ensures you get all the output from the command that you run. I'll blog about this later. If looking at the output in sysdtslog90 then you will have an entry with datacode -1073573533 and error message File or directory "<filename>" represented by connection "<connection>" does not exist.  Not exactly helpful. If you get the output from the console then you will also get these errors. 0xC0202070 "The file name property is not valid. The file name is a device or contains invalid characters." 0xC001401E "specified in the connection was not valid." It appears this error is due to the use of a UNC path and the account runnnig the package not having access to all the folders in the path. Solution To solve this you need to ensure that the proxy account has access to ALL folders in the path you are accessing. To check this works, logon as the relevant proxy user, or run a command window as the specified user. Then try and do net use \\server\share and then do a dir for each folder in the path and check you have access. If these work and you still have the problem then you have some other problem, sorry. The following are posts on experts exchange that also discuss this,http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SSIS/Q_24056047.htmlhttp://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SSIS/Q_23968903.html This blog had a post about it being a 64 bit issue. That definitely wasn't the issue for me as I was on a 32 bit server http://blogs.perkinsconsulting.com/post/64-bit-SQL-Server-2005-SSIS-and-UNC-paths-Part-2.aspx  

    Read the article

  • Dynamic Unpivot : SSIS Nugget

    - by jamiet
    A question on the SSIS forum earlier today asked: I need to dynamically unpivot some set of columns in my source file. Every month there is one new column and its set of Values. I want to unpivot it without editing my SSIS packages that is deployed Let’s be clear about what we mean by Unpivot. It is a normalisation technique that basically converts columns into rows. By way of example it converts something like this: AccountCode Jan Feb Mar AC1 100.00 150.00 125.00 AC2 45.00 75.50 90.00 into something like this: AccountCode Month Amount AC1 Jan 100.00 AC1 Feb 150.00 AC1 Mar 125.00 AC2 Jan 45.00 AC2 Feb 75.50 AC2 Mar 90.00 The Unpivot transformation in SSIS is perfectly capable of carrying out the operation defined in this example however in the case outlined in the aforementioned forum thread the problem was a little bit different. I interpreted it to mean that the number of columns could change and in that scenario the Unpivot transformation (and indeed the SSIS dataflow in general) is rendered useless because it expects that the number of columns will not change from what is specified at design-time. There is a workaround however. Assuming all of the columns that CAN exist will appear at the end of the rows, we can (1) import all of the columns in the file as just a single column, (2) use a script component to loop over all the values in that “column” and (3) output each one as a column all of its own. Let’s go over that in a bit more detail.   I’ve prepared a data file that shows some data that we want to unpivot which shows some customers and their mythical shopping lists (it has column names in the first row): We use a Flat File Connection Manager to specify the format of our data file to SSIS: and a Flat File Source Adapter to put it into the dataflow (no need a for a screenshot of that one – its very basic). Notice that the values that we want to unpivot all exist in a column called [Groceries]. Now onto the script component where the real work goes on, although the code is pretty simple: Here I show a screenshot of this executing along with some data viewers. As you can see we have successfully pulled out all of the values into a row all of their own thus accomplishing the Dynamic Unpivot that the forum poster was after. If you want to run the demo for yourself then I have uploaded the demo package and source file up to my SkyDrive: http://cid-550f681dad532637.skydrive.live.com/self.aspx/Public/BlogShare/20100529/Dynamic%20Unpivot.zip Simply extract the two files into a folder, make sure the Connection Manager is pointing to the file, and execute! Hope this is useful. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Five things SSIS should drop

    - by jamiet
    There’s a current SQL Server meme going round entitled Five things SQL Server should drop and, whilst no-one tagged me to write anything, I couldn’t resist doing the same for SQL Server Integration Services. So, without further ado, here are five things that I think should be dropped from SSIS.Data source connectionsSeriously, does anyone use these? I know why they’re there. Someone sat in a meeting back in the early part of the last decade and said “Ooo, Reporting Services and Analysis Services have these things called Data Sources. If we used them in Integration Services then we’d have a really cool integration story.” Errr….no.Web Service TaskDitto. If you want to do anything useful against anything but the simplest of SOAP web services steer well clear of this peculiar SSIS additionActiveX Script TaskAnother task that I suspect has never seen the light of day in a SSIS package. It was billed as a way of running upgraded DTS2000 ActiveX scripts in SSIS – sounds good except for one thing. Anytime one of those scripts would try to talk to the DTS object model (which they all do – otherwise what’s the point) then they will error out. This one has always been a real head scratcher.Slow Changing Dimension wizardI suspect I may get some push back on this one but I’m mentioning it anyway. Some people like the SCD wizard; I am not one of those people! Everything that the SCD component does can easily be reproduced using other components and from a performance point of view its much more beneficial to use those alternatives.Multifile Connection ManagerImagining buying a house that came with a set of keys that didn’t open any of the doors. Sounds ridiculous right? How about a SSIS Connection Manager that doesn’t get used by any of the tasks or components. Ah, that’ll be the Multifile Connection Manager then!Comments are of course welcome. Diatribes are assumed :)@Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Export all SSIS packages from msdb using Powershell

    - by jamiet
    Have you ever wanted to dump all the SSIS packages stored in msdb out to files? Of course you have, who wouldn’t? Right? Well, at least one person does because this was the subject of a thread (save all ssis packages to file) on the SSIS forum earlier today. Some of you may have already figured out a way of doing this but for those that haven’t here is a nifty little script that will do it for you and it uses our favourite jack-of-all tools … Powershell!! Imagine I have the following package folder structure on my Integration Services server (i.e. in [msdb]): There are two packages in there called “20110111 Chaining Expression components” & “Package”, I want to export those two packages into a folder structure that mirrors that in [msdb]. Here is the Powershell script that will do that:Param($SQLInstance = "localhost") #####Add all the SQL goodies (including Invoke-Sqlcmd)##### add-pssnapin sqlserverprovidersnapin100 -ErrorAction SilentlyContinue add-pssnapin sqlservercmdletsnapin100 -ErrorAction SilentlyContinue cls $Packages = Invoke-Sqlcmd -MaxCharLength 10000000 -ServerInstance $SQLInstance -Query "WITH cte AS ( SELECT cast(foldername as varchar(max)) as folderpath, folderid FROM msdb..sysssispackagefolders WHERE parentfolderid = '00000000-0000-0000-0000-000000000000' UNION ALL SELECT cast(c.folderpath + '\' + f.foldername as varchar(max)), f.folderid FROM msdb..sysssispackagefolders f INNER JOIN cte c ON c.folderid = f.parentfolderid ) SELECT c.folderpath,p.name,CAST(CAST(packagedata AS VARBINARY(MAX)) AS VARCHAR(MAX)) as pkg FROM cte c INNER JOIN msdb..sysssispackages p ON c.folderid = p.folderid WHERE c.folderpath NOT LIKE 'Data Collector%'" Foreach ($pkg in $Packages) { $pkgName = $Pkg.name $folderPath = $Pkg.folderpath $fullfolderPath = "c:\temp\$folderPath\" if(!(test-path -path $fullfolderPath)) { mkdir $fullfolderPath | Out-Null } $pkg.pkg | Out-File -Force -encoding ascii -FilePath "$fullfolderPath\$pkgName.dtsx" } To run it simply change the “localhost” parameter of the server you want to connect to either by editing the script or passing it in when the script is executed. It will create the folder structure in C:\Temp (which you can also easily change if you so wish – just edit the script accordingly). Here’s the folder structure that it created for me: Notice how it is a mirror of the folder structure in [msdb]. Hope this is useful! @Jamiet UPDATE: THis post prompted Chad Miller to write a post describing his Powershell add-in that utilises a SSIS API to do exporting of packages. Go take a read here: http://sev17.com/2011/02/importing-and-exporting-ssis-packages-using-powershell/

    Read the article

  • SSIS Expression Tester Tool

    - by Davide Mauri
    Thanks to my friend's Doug blog I’ve found a very nice tool made by fellow MVP Darren Green which really helps to make SSIS develoepers life easier: http://expressioneditor.codeplex.com/Wikipage?ProjectName=expressioneditor In brief the tool allow the testing of SSIS Expression so that one can evaluate and test them before using in SSIS packages. Cool and useful! Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • SSIS Expression Tester Tool

    - by Davide Mauri
    Thanks to my friend's Doug blog I’ve found a very nice tool made by fellow MVP Darren Green which really helps to make SSIS develoepers life easier: http://expressioneditor.codeplex.com/Wikipage?ProjectName=expressioneditor In brief the tool allow the testing of SSIS Expression so that one can evaluate and test them before using in SSIS packages. Cool and useful! Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Announcing Four Weeks of SSIS MicroTraining!

    - by andyleonard
    For the next four Tuesdays – 29 Nov, 6 Dec, 13 Dec, and 20 Dec – I will deliver a 30 – 45 minute presentation beginning at 11:00 AM EST on Google+ . Please note Google+ limits attendance to the first ten people who join the Hangout and I have no control over who gets in. The topics will be: 29 Nov – “I See a Control Flow. Now What?” (Creating Your First SSIS Package) 6 Dec – The Incremental Load SSIS Design Pattern 13 Dec – Flat File Fu 20 Dec – SSIS Frameworks Magic Details 29 Nov – “I See a Control...(read more)

    Read the article

  • Required Parameters [SSIS Denali]

    - by jamiet
    SQL Server Integration Services (SSIS) in its 2005 and 2008 incarnations expects you to set a property values within your package at runtime using Configurations. SSIS developers tend to have rather a lot of issues with SSIS configurations; in this blog post I am going to highlight one of those problems and how it has been alleviated in SQL Server code-named Denali.   A configuration is a property path/value pair that exists outside of a package, typically within SQL Server or in a collection of one or more configurations in a file called a .dtsConfig file. Within the package one defines a pointer to a configuration that says to the package “When you execute, go and get a configuration value from this location” and if all goes well the package will fetch that configuration value as it starts to execute and you will see something like the following in your output log: Information: 0x40016041 at Package: The package is attempting to configure from the XML file "C:\Configs\MyConfig.dtsConfig". Unfortunately things DON’T always go well, perhaps the .dtsConfig file is unreachable or the name of the SQL Sever holding the configuration value has been defined incorrectly – any one of a number of things can go wrong. In this circumstance you might see something like the following in your log output instead: Warning: 0x80012014 at Package: The configuration file "C:\Configs\MyConfig.dtsConfig" cannot be found. Check the directory and file name. The problem that I want to draw attention to here though is that your package will ignore the fact it can’t find the configuration and executes anyway. This is really really bad because the package will not be doing what it is supposed to do and worse, if you have not isolated your environments you might not even know about it. Can you imagine a package executing for months and all the while inserting data into the wrong server? Sounds ridiculous but I have absolutely seen this happen and the root cause was that no-one picked up on configuration warnings like the one above. Happily in SSIS code-named Denali this problem has gone away as configurations have been replaced with parameters. Each parameter has a property called ‘Required’: Any parameter with Required=True must have a value passed to it when the package executes. Any attempt to execute the package will result in an error. Here we see that error when attempting to execute using the SSMS UI: and similarly when executing using T-SQL: Error is: Msg 27184, Level 16, State 1, Procedure prepare_execution, Line 112 In order to execute this package, you need to specify values for the required parameters.   As you can see, SSIS code-named Denali has mechanisms built-in to prevent the problem I described at the top of this blog post. Specifying a Parameter required means that any packages in that project cannot execute until a value for the parameter has been supplied. This is a very good thing. I am loathe to make recommendations so early in the development cycle but right now I’m thinking that all Project Parameters should have Required=True, certainly any that are used to define external locations should be anyway. @Jamiet

    Read the article

  • Have SSIS' differing type systems ever caused you problems?

    - by jamiet
    One thing that has always infuriated me about SSIS is the fact that every package has three different type systems; to give you an idea of what I am talking about consider the following: The SSIS dataflow's type system is made up of types called DT_*  (e.g. DT_STR, DT_I4) The SSIS variable type system is based on .Net datatypes (e.g. String, Int32) The types available for Execute SQL Task's parameters are based on something else - I don't exactly know what (e.g. VARCHAR, LONG) Speaking euphemistically ... this is not an optimum situation (were I not speaking euphemistically I would be a lot ruder) and hence I have submitted a suggestion to Connect at [SSIS] Consolidate three type systems into one requesting that it be remedied. This accompanying blog post is not however a request for votes (though that would be nice); the reason is actually subtler than that. Let me explain. I have been submitting bugs and suggestions pertaining to SSIS for years and have, so far, submitted over 200 Connect items. If that experience has taught me anything it is this - Connect items are not generally actioned because they are considered "nice to have". No, SSIS Connect items get actioned because they cause customers grief and if I am perfectly honest I must admit that, other than being a bit gnarly, SSIS' three type system architecture has never knowingly caused me any significant problems. The reason for this blog post is to ask if any reader out there has ever encountered any problems on account of SSIS' three type systems or have you, like me, never found them to be a problem? Errors or performance degredation caused by implicit type conversions would, I believe, present a strong case for getting this situation remedied in a future version of SSIS so if you HAVE encountered such problems I would encourage you to leave a comment on the Connect submission accordingly. Let me know in the comments too - I would be interested to hear others' opinions on this. @Jamiet

    Read the article

  • SSIS - Connect to Oracle on a 64-bit machine (Updated for SSIS 2008 R2)

    - by jorg
    We recently had a few customers where a connection to Oracle on a 64 bit machine was necessary. A quick search on the internet showed that this could be a big problem. I found all kind of blog and forum posts of developers complaining about this. A lot of developers will recognize the following error message: Test connection failed because of an error in initializing provider. Oracle client and networking components were not found. These components are supplied by Oracle Corporation and are part of the Oracle Version 7.3.3 or later client software installation. Provider is unable to function until these components are installed. After a lot of searching, trying and debugging I think I found the right way to do it! Problems Because BIDS is a 32 bit application, as well on 32 as on 64 bit machines, it cannot see the 64 bit driver for Oracle. Because of this, connecting to Oracle from BIDS on a 64 bit machine will never work when you install the 64 bit Oracle client. Another problem is the "Microsoft Provider for Oracle", this driver only exists in a 32 bit version and Microsoft has no plans to create a 64 bit one in the near future. The last problem I know of is in the Oracle client itself, it seems that a connection will never work with the instant client, so always use the full client. There are also a lot of problems with the 10G client, one of it is the fact that this driver can't handle the "(x86)" in the path of SQL Server. So using the 10G client is no option! Solution Download the Oracle 11G full client. Install the 32 AND the 64 bit version of the 11G full client (Installation Type: Administrator) and reboot the server afterwards. The 32 bit version is needed for development from BIDS with is 32 bit, the 64 bit version is needed for production with the SQLAgent, which is 64 bit. Configure the Oracle clients (both 32 and 64 bits) by editing  the files tnsnames.ora and sqlnet.ora. Try to do this with an Oracle DBA or, even better, let him/her do this. Use the "Oracle provider for OLE DB" from SSIS, don't use the "Microsoft Provider for Oracle" because a 64 bit version of it does not exist. Schedule your packages with the SQLAgent. Background information Visual Studio (BI Dev Studio)is a 32bit application. SQL Server Management Studio is a 32bit application. dtexecui.exe is a 32bit application. dtexec.exe has both 32bit and 64bit versions. There are x64 and x86 versions of the Oracle provider available. SQLAgent is a 64bit process. My advice to BI consultants is to get an Oracle DBA or professional for the installation and configuration of the 2 full clients (32 and 64 bit). Tell the DBA to download the biggest client available, this way you are sure that they pick the right one ;-) Testing if the clients have been installed and configured in the right way can be done with Windows ODBC Data Source Administrator: Start... Programs... Administrative tools... Data Sources (ODBC) ADITIONAL STEPS FOR SSIS 2008 R2 It seems that, unfortunately, some additional steps are necessary for SQL Server 2008 R2 installations: 1. Open REGEDIT (Start… Run… REGEDIT) on the server and search for the following entry (for the 32 bits driver): HKEY_LOCAL_MACHINE\Software\Microsoft\MSDTC\MTxOCI Make sure the following values are entered: 2. Next, search for (for the 64 bits driver): HKEY_LOCAL_MACHINE\Software\Wow6432Node\Microsoft\MSDTC\MTxOCI Make sure the same values as above are entered. 3. Reboot your server.

    Read the article

  • SSIS code smell – Unused columns in the dataflow

    - by jamiet
    A code smell is defined on Wikipedia as being a “symptom in the source code of a program that possibly indicates a deeper problem”. It’s a term commonly used by our code-writing brethren to describe sub-optimal code but I think the term can be applied equally well to SSIS packages too as I shall now explain One of my pet hates about SSIS development is packages that throw warnings of the form: The output column "ColumnName" (1358) on output "OLE DB Source Output" (1289) and component "OLE_SRC Name" (1279) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.  The warning is fairly self-explanatory – any column that appears in the data flow but doesn’t get used will throw this warning when the data flow is executed. Its not the negligible performance degradation that they cause that bothers me though, it’s the clutter that they cause in your log file/table. Take a look at the following screenshot if you don’t believe me: There are 231409 such warnings in the system that I took this screenshot from, that is 231409 log records that should not be there. The most infuriating thing about this warning is that it is so easily avoidable; eliminating such columns is a very quick and easy thing to do in the SSIS Designer. The only problem I see is that the warnings don’t occur until you execute the package – it would be preferable for the designer to have an unobtrusive way of informing you of them as well. Anyway, I digress… I consider such warnings to be a code smell because, to me, they’re symptomatic of a lack of due care and attention; a lack of developer discipline if you will. What other code smells can you think of when building SSIS packages? If I get a good list in the comments maybe I’ll compile them into a later blog post. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Learn SSIS in London 12-14 Sep 2012!

    - by andyleonard
    My friends at TechniTrain , the students, and I had a blast during the March 2012 London delivery of From Zero To SSIS ! We have decided to do it again in September 2012 with my new Learning SSIS 2012 3-day course ! Please find a course outline here . It is difficult to list everything I cover in the course, but the outline hits the high spots. This material grew out of my experiences serving as a consultant on short-term engagements and as a manager (and enterprise ETL architect) for a team of forty...(read more)

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >