Search Results

Search found 487 results on 20 pages for 'etl instrumentation'.

  • SQL SERVER – Introduction to Adaptive ETL Tool – How adaptive is your ETL?

    - by pinaldave
    I am often reminded of the fact that BI/data warehousing infrastructure is very brittle and not very adaptive to change. There are lots of basic use cases where data needs to be frequently loaded into SQL Server or another database. What I have found is that as long as the sources and targets stay the same, SSIS or any other ETL tool for that matter does a pretty good job handling these types of scenarios. But what happens when you are faced with more challenging scenarios, where the data formats and possibly the data types of the source data are changing from customer to customer?

    Let's examine a real-life situation where a health management company receives claims data from its customers in various source formats. Even though this company supplied all its customers with the same claims forms, it ended up building one-off ETL applications to process the claims for each customer. Why, you ask? Well, it turned out that the claims data from the various regional hospitals they needed to process had slightly different data formats, e.g. "integer" versus "string" data field definitions. Moreover, the data itself was represented with slight nuances, e.g. "0001124" or "1124" or "0000001124" to represent a particular account number, which forced them, as I alluded to above, to build new ETL processes for each customer in order to overcome the inconsistencies in the various claims forms. As a result, they experienced a lot of redundancy in these ETL processes and recognized quickly that their system would become more difficult to maintain over time.

    So imagine for a moment that you could use an ETL tool that helps you abstract the data formats so that your ETL transformation process becomes more reusable. Imagine that one claims form represents a data item as a string – acc_no(varchar) – while a second claims form represents the same data item as an integer – account_no(integer). This would break your traditional ETL process, as the data mappings are hard-wired. But in a world of abstracted definitions, all you need to do is create parallel data mappings to a common data representation used within your ETL application; that is, map both external data fields to a common attribute whose name and type remain unchanged within the application: acc_no(varchar) is mapped to account_number(integer) [expressor Studio: first claim form schema mapping], and account_no(integer) is also mapped to account_number(integer) [expressor Studio: second claim form schema mapping]. All the data processing logic that follows manipulates the data as an integer value named account_number.

    Well, these are the kinds of problems that the expressor data integration solution automates for you. I've been following them since last year and encourage you to check them out by downloading their free expressor Studio ETL software. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Business Intelligence, Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: ETL, SSIS
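    As a rough illustration of the abstraction described above, here is a minimal C# sketch (all names are hypothetical, not expressor's actual API) in which each claim-form schema carries its own parallel mapping onto one canonical, consistently typed attribute:

        using System;
        using System.Collections.Generic;

        // Hypothetical sketch: normalize differently typed source fields
        // ("acc_no" as a varchar-style string, "account_no" as an integer)
        // onto one canonical attribute, account_number(integer), so the
        // downstream logic never sees the per-customer variations.
        class CanonicalClaim
        {
            public int AccountNumber { get; set; }
        }

        static class ClaimMapper
        {
            // Parallel data mappings: one per claim-form schema.
            static readonly Dictionary<string, Func<IDictionary<string, object>, CanonicalClaim>> Mappings =
                new Dictionary<string, Func<IDictionary<string, object>, CanonicalClaim>>
                {
                    // First form: acc_no arrives as a possibly zero-padded string ("0001124").
                    ["form1"] = row => new CanonicalClaim
                    {
                        AccountNumber = int.Parse((string)row["acc_no"])
                    },
                    // Second form: account_no is already an integer.
                    ["form2"] = row => new CanonicalClaim
                    {
                        AccountNumber = (int)row["account_no"]
                    }
                };

            public static CanonicalClaim Map(string formId, IDictionary<string, object> row)
                => Mappings[formId](row);
        }

    With this arrangement, "0001124", "1124", and 1124 all arrive downstream as the single integer 1124, which is exactly the effect the abstracted definitions are after.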

    Read the article

  • SQL SERVER – 4 Tips for ETL Software IDE Developers

    - by pinaldave
    In a previous blog, I introduced the notion of Semantic Types. To an end-user, a seamlessly integrated semantic typing engine significantly increases the ease of use of an ETL IDE (integrated development environment, or developer studio). This led me to think about other ease-of-use issues I have encountered while building ETL applications. When I get stumped while programming, I find myself asking variations on these questions: "How do I…?" "Now what?" "Why isn't this working?" "Why do I have to redo the work I just did?" It seems to me that a good ETL IDE will anticipate these questions and seek to answer them before they are even asked. So here are my tips to help software vendors build developer IDEs that actually make development easier.

    How do I…? While developing an ETL application, have you ever asked yourself: "How do I set up the connection to my SQL Server database?", "How do I import my table definitions from Access?", etc.? An easy answer might be "read the manual", but sometimes product manuals are not robust or easily accessible. So, integrating robust how-to instructions directly into your ETL studio would help users get the information they need at the time they need it.

    Now what? IDEs in general know where you last clicked or performed an action using an input device such as a keyboard, so they should be able to reasonably predict the design context you are in and suggest the next steps accordingly. Context-sensitive suggestions based on the state of the user's work will help users move forward in ETL application development.

    Why isn't this working? Or, why do I have to wait until I compile to be told about a critical design issue? If an ETL IDE is smart enough to signal to users what in their design structures is left to be completed or has been completed incorrectly, then the developer can spend much less time in the design→compile→error-correct loop. Just-in-time validation helps users detect and correct programming errors earlier in the ETL development life cycle.

    Why do I have to redo the work I just did? In ETL development, schemas, transformation rules, connectivity objects, etc., can be reused in various situations. Using mouse-clicks to build and manage libraries of reusable design objects means that the application development effort should decrease over time as the library acquires more objects.

    I met a great company at SQL PASS that is trying to address many of these usability issues. Check them out at www.expressor-software.com. What other ease-of-use suggestions do you have for ETL software vendors? Please post your valuable comments. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Best Practices, Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: ETL

    Read the article

  • ETL Software Research Question

    - by WernerCD
    Where I work, we use an in-house ETL solution that's homegrown and has been around for 5-10 years. I'm still new to my data analysis job, but I was wondering about the ETL tools that are out there; this is a new area for me. My situation, and job, basically amounts to: digging into a set of databases (DB2, SQL 2005, Citrix, an ancient COBOL database with a SQL wrapper on top, MySQL, etc.); gathering the desired information; combining the different datasets into one set; outputting to a file of choice (CSV, tab-separated, pipe-separated, XLS, etc.); and FTPing the result to the customer. I guess my real question is: given my job, what are some good ETL suites that I can look at and compare to my in-house tools? This is more to research some other options. Ultimately, I'd either suggest a new solution or get options/ideas to improve our current app.
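    For concreteness, here is a hedged C# sketch of the gather/combine/output/FTP flow described above (connection strings, queries, hostnames, and credentials are all placeholders):

        using System;
        using System.Collections.Generic;
        using System.Data.Odbc;
        using System.IO;
        using System.Net;

        // Hypothetical sketch: pull rows from several source databases,
        // combine them into one set, write a delimited file, and FTP it out.
        class MiniPipeline
        {
            static void Main()
            {
                var combined = new List<string[]>();

                // 1. Gather: each source is reached through some ODBC driver.
                foreach (var (connStr, query) in new[]
                {
                    ("DSN=db2source", "SELECT id, amount FROM claims"),
                    ("DSN=sql2005source", "SELECT id, amount FROM dbo.claims")
                })
                {
                    using var conn = new OdbcConnection(connStr);
                    conn.Open();
                    using var cmd = new OdbcCommand(query, conn);
                    using var reader = cmd.ExecuteReader();
                    while (reader.Read())
                        combined.Add(new[] { reader[0].ToString(), reader[1].ToString() });
                }

                // 2. Combine + output: pipe-separated here; any delimiter works the same way.
                File.WriteAllLines("extract.txt",
                    combined.ConvertAll(r => string.Join("|", r)));

                // 3. Deliver: upload to the customer's FTP server.
                using var ftp = new WebClient { Credentials = new NetworkCredential("user", "pass") };
                ftp.UploadFile("ftp://customer.example.com/extract.txt", "extract.txt");
            }
        }

    Any dedicated ETL suite under comparison ultimately packages these same three stages behind designers and connectors, which is a useful baseline when evaluating them against an in-house tool.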

    Read the article

  • Designing a Content-Based ETL Process with .NET and SFDC

    - by Patrick
    As my firm makes the transition to using SFDC as our main operational system, we've spun together a couple of SFDC portals where we can post customer-specific documents to be viewed at will. As such, we've had the need for pseudo-ETL applications to be implemented that are able to extract metadata from the documents our analysts generate internally (most are industry-standard PDFs, XML, or MS Office formats) and place in networked "queue" folders. From there, our applications scoop up the queued documents and upload them to the appropriate SFDC CRM Content Library along with some select pieces of metadata. I've mostly used DbAmp to broker communication with SFDC (DbAmp is a Linked Server provider that allows you to use SQL conventions to interact with your SFDC Org data). I've been able to create [console] applications in C# that work pretty well, and they're usually structured something like this: static void Main() { // Load parameters from app.config. // Get documents from queue. var files = someInterface.GetFiles(someFilterOrRegexPattern); foreach (var file in files) { // Extract metadata from the file. // Validate some attributes of the file; add any validation errors to an in-memory // structure (e.g. List<ValidationErrors>). if (isValid) { var fileData = File.ReadAllBytes(file); // Upload using some wrapper for an ORM or DAL someInterface.Upload(fileData, meta.Param1, meta.Param2, ...); } else { // Bounce the file } } // Report any validation errors (via message bus or SMTP or some such). } And that's pretty much it. Most of the time I wrap all these operations in a "Worker" class that takes the needed interfaces as constructor parameters. This approach has worked reasonably well, but I just get this feeling in my gut that there's something awful about it and would love some feedback. Is writing an ETL process as a C# Console app a bad idea? I'm also wondering if there are some design patterns that would be useful in this scenario that I'm clearly overlooking. Thanks in advance!
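    Since the post asks about structure, here is a minimal, hedged sketch of the "Worker" arrangement it describes (interface and member names are invented for illustration): dependencies arrive as constructor parameters, Main shrinks to wiring, and each collaborator can be faked in tests.

        using System.Collections.Generic;

        // Hypothetical interfaces standing in for the queue-folder reader
        // and the DbAmp/SFDC upload wrapper the post mentions.
        interface IFileQueue
        {
            IEnumerable<string> GetFiles(string pattern);
            byte[] Read(string path);
        }

        interface IContentUploader
        {
            void Upload(byte[] data, IDictionary<string, string> metadata);
        }

        class Worker
        {
            private readonly IFileQueue _queue;
            private readonly IContentUploader _uploader;
            public List<string> ValidationErrors { get; } = new List<string>();

            public Worker(IFileQueue queue, IContentUploader uploader)
            {
                _queue = queue;
                _uploader = uploader;
            }

            public void Run(string pattern)
            {
                foreach (var file in _queue.GetFiles(pattern))
                {
                    var metadata = ExtractMetadata(file);   // stub for the real parser
                    if (metadata == null)
                    {
                        ValidationErrors.Add(file);         // "bounce" the file
                        continue;
                    }
                    _uploader.Upload(_queue.Read(file), metadata);
                }
            }

            // Placeholder: the real version would pull attributes out of the
            // PDF/XML/Office document.
            private IDictionary<string, string> ExtractMetadata(string file)
                => new Dictionary<string, string> { ["source"] = file };
        }

    Nothing about this pattern requires the host to be a console app; the same Worker could be driven by a Windows service or a scheduled job, which is one common answer to the "is a console app a bad idea?" worry.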

    Read the article

  • TimeStamp and mini-ETL (extract, transform, load)

    - by Tomaz.tsql
    A short example of how to use a TIMESTAMP column for a mini ETL process on your data. The example below works as follows: Table_1 is a production table on server1; Table_2 is a data warehouse table on server2, where the data warehouse is located. Every day, data are extracted, transformed, and loaded into the data warehouse for further off-line usage, data analysis, and business decision support.
    1. Creating the environment:
    if object_id ('table_1') is not null drop table table_1;
    go
    if object_id ('table_2') is not null drop...(read more)
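    On the extract side, the rowversion/TIMESTAMP idea usually reduces to keeping a high-water mark between runs. Here is a hedged ADO.NET sketch of that pattern (connection strings, column names, and the persistence of the mark are assumptions; the post's own T-SQL continues behind the "read more" link):

        using System;
        using System.Data.SqlClient;

        // Sketch: copy from table_1 (server1) only the rows whose rowversion
        // ("ts") exceeds the mark saved by the previous run, then advance the mark.
        class MiniEtl
        {
            static void Main()
            {
                byte[] lastMark = LoadMark();

                using var src = new SqlConnection(
                    "Server=server1;Database=prod;Integrated Security=true");
                src.Open();
                using var cmd = new SqlCommand(
                    "SELECT id, payload, ts FROM table_1 WHERE ts > @mark ORDER BY ts", src);
                cmd.Parameters.AddWithValue("@mark", lastMark);

                using var reader = cmd.ExecuteReader();
                while (reader.Read())
                {
                    // ...transform and load the changed row into table_2 on server2...
                    lastMark = (byte[])reader["ts"];   // remember the highest rowversion seen
                }
                SaveMark(lastMark);
            }

            static byte[] LoadMark() => new byte[8];   // placeholder persistence
            static void SaveMark(byte[] mark) { }      // placeholder persistence
        }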

    Read the article

  • How to automate a monitoring system for ETL runs

    - by Jeffrey McDaniel
    Upon completion of the Primavera ETL process, there are a few ways to determine whether the process finished successfully. First, in the <installation directory>\log folder, there are staretlprocess.log and staretl.html files. These files give the output results of the ETL run. The staretl.html file gives a detailed summary of each step of the process, its run time, and its status. The .log file, based on the logging level set in the Configuration tool, can give extensive information about the ETL process. The log file can be used as validation of process completion. To automate the monitoring of these log files, perform the following steps:
    1. Write a custom application to parse through the log file and search for [ERROR]. In most cases, a major [ERROR] could cause the ETL process to fail; searching the log and finding this value is worthy of an alert.
    2. Determine the total number of steps in the ETL process, and validate that the log file recorded an entry for the final step. For example, validate that your log file contains an entry for Step 39/39 (this could differ based on the version you are running). If there is no Step 39/39, then either the process is taking longer than expected or it didn't make it to the end. Either way, this would be good cause for an alert.
    3. Check the last line in the log file. It should contain an indication that the ETL run completed successfully. For example (results could differ based on Reporting Database version): [INFO] (Message) Finished Writing Report
    4. You could write an Ant script to execute the ETL process with failonerror="true", and from there send results to an external tool to monitor the jobs, send email, or write to a database.
    With each ETL run, the log file appends to the existing log file by default. Because of this behavior, I would recommend renaming the existing log files before running a new ETL process; that way, only log entries for the currently running ETL process are recorded in the new log files. Based on these log entries, alerts can be set up to notify the administrator or DBA.
    Another way to determine whether the ETL process has completed successfully is to monitor the etl_processmaster table. Depending on the Reporting Database version, this could be in the Stage or Star database; as of Reporting Database 2.2 and higher, it is in the Star database. The etl_processmaster table records entries for the ETL run along with a Start and a Finish time. If the ETL process has failed, the Finish date will be null. This table can be queried at the time the ETL process is expected to be finished, and an alert sent if the value is null. These are just some options; there are additional ways this can be accomplished based around these two areas: log files or the database.
    Here is an additional query to gather more information about your ETL run (connect as Staruser):

    SELECT SYSDATE, test_script,
           decode(loc, 0, PROCESSNAME, trim(SUBSTR(PROCESSNAME, loc+1))) PROCESSNAME,
           duration duration
    from (
      select (e.endtime - b.starttime) * 1440 duration,
             to_char(b.starttime, 'hh24:mi:ss') starttime,
             to_char(e.endtime, 'hh24:mi:ss') endtime,
             b.PROCESSNAME,
             instr(b.PROCESSNAME, ']') loc,
             b.infotype test_script
      from (
        select processid, infodate starttime, PROCESSNAME, INFOMSG, INFOTYPE
        from etl_processinfo
        where processid = (select max(PROCESSID) from etl_processinfo)
          and infotype = 'BEGIN'
      ) b
      inner join (
        select processid, infodate endtime, PROCESSNAME, INFOMSG, INFOTYPE
        from etl_processinfo
        where processid = (select max(PROCESSID) from etl_processinfo)
          and infotype = 'END'
      ) e
        on b.processid = e.processid
       and b.PROCESSNAME = e.PROCESSNAME
      order by b.starttime
    )
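    The first three log checks above are easy to script. As one possibility, here is a hedged C# sketch (the log path and the Step 39/39 count come from the post and will differ by installation):

        using System;
        using System.IO;
        using System.Linq;

        // Sketch of the suggested log checks: look for [ERROR], confirm the
        // final step marker was logged, and confirm the closing
        // "[INFO] (Message) Finished Writing Report" line.
        class EtlLogMonitor
        {
            static void Main()
            {
                var lines = File.ReadAllLines(@"C:\Primavera\log\staretlprocess.log");

                bool hasError = lines.Any(l => l.Contains("[ERROR]"));
                bool reachedLastStep = lines.Any(l => l.Contains("Step 39/39"));
                bool finished = lines.Length > 0 &&
                                lines[lines.Length - 1].Contains("Finished Writing Report");

                if (hasError || !reachedLastStep || !finished)
                    SendAlert($"ETL run suspect: error={hasError}, " +
                              $"lastStep={reachedLastStep}, finished={finished}");
            }

            static void SendAlert(string message) =>
                Console.Error.WriteLine(message);   // stand-in for SMTP or a message bus
        }

    The same alert hook could be fed by the etl_processmaster check instead: query for the latest run and alert if its Finish time is still null past the expected completion time.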

    Read the article

  • SSIS Design Patterns Training in London 8-11 Sep!

    - by andyleonard
    A few seats remain for my course SQL Server Integration Services 2012 Design Patterns, to be delivered in London 8-11 Sep 2014. Register today to learn more about:
    - New features in SSIS 2012 and 2014
    - Advanced patterns for loading data warehouses
    - Error handling
    - The (new) Project Deployment Model
    - Scripting in SSIS
    - The (new) SSIS Catalog
    - Designing custom SSIS tasks
    - Executing, managing, monitoring, and administering SSIS in the enterprise
    - Business Intelligence Markup Language (Biml)
    - BimlScript
    - ETL Instrumentation...(read more)

    Read the article

  • Great Discussion of ETL and ELT Tooling in TDWI Linkedin Group

    - by antonio romero
    All, there's a great discussion of ETL and ELT tooling going on in the official TDWI LinkedIn group, under the heading "How Sustainable is SQL for ETL?" It delves into a wide range of topics:
    - The pros and cons of handcoding vs. using tools to design ETL
    - ETL (with separate transformation engines) vs. ELT (transforms in the database) and push-down solutions
    - The future of ETL and data warehousing products
    A number of community members (of varying affiliations) have kept this conversation going for many months, and are learning from each other as they go. So check it out… Also, while you're on LinkedIn, join the Oracle ETL/Data Integration LinkedIn group (for both OWB and ODI users), which recently passed the 2,000 member mark.

    Read the article

  • SQL SERVER – Sharing your ETL Resources Across Applications with Ease

    - by pinaldave
    Frequently an organization will find that the same resources are used in multiple ETL applications, for example, the same database, general-purpose processing logic, or file system locations. Creating an easy way to reuse these resources across multiple applications would increase efficiency and reduce errors. Moreover, not every ETL developer has the same skill set, and it is likely that one developer will be more adept at writing code while another is more comfortable configuring database connections. Real productivity gains will come when these developers are able to work independently while still making their work available to others assigned to the same project. These are the benefits of a centralized version control system.

    Of course, most version control systems could be used to store and serve files, but the real need is to store and serve entire ETL applications so that each developer's ongoing work can immediately benefit from another developer's completed work. In other words, the version control system needs to be tightly integrated with the tools used to develop the ETL application. [Screenshot: a desktop ETL tool that tightly integrates with a central version control system.]

    Developers can check out or commit entire projects or just a single artifact. Each artifact may be managed independently, so if you need to go back to an earlier version of one artifact, changes you may have made to other artifacts are not lost. By being tightly integrated into the graphical environment used to create and edit the project artifacts, it is extremely easy and straightforward to move your files to and from the version control system, and there is no dependency on another vendor's version control system. The built-in version control system is optimized for managing the artifacts of ETL applications. It is equally important that the version control system supports all of the actions one typically performs, such as rollbacks, locking and unlocking of files, and the ability to resolve conflicts. Note that this particular ETL tool also has the capability to switch back and forth between multiple version control systems. It also needs to be easy to determine the status of an artifact: not just that it has been committed or modified, but when and by whom. Generally you must query the version control system for this information, but having it displayed within the development environment is more desirable.

    Whose ETL tool works in this fashion? Last month I mentioned the data integration solution offered by expressor software. The version control features I described in this post are all available in their just-released expressor 3.1 Standard Edition through the integration of their expressor Studio development environment with a centralized metadata repository and version control system. You can download their Studio application, which is free, or evaluate the full Standard Edition on your own hardware. It may be worth your time. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Instrumentation class in the Android API.

    - by Riyas
    Hi all, I have a question on the Android API. The Android API provides a class called "Instrumentation". What is the use of this class? Can the Instrumentation class be used only together with JUnit for unit testing? Can the JUnit framework be used to test the methods of the Android API without using the Instrumentation class? Since the JUnit package has been included in the Android package, I hope we don't need to install it separately for unit testing. I would appreciate it if you could provide me the information, as I can't find this clearly explained anywhere on the Web. If we use the JUnit test framework to test the Android API, can we have test results in a UI format rather than text format? Thanks a lot. I appreciate your time. Regards, Riyas

    Read the article

  • .NET Single Line Logging (ala Trace.Write/WriteLine) using Instrumentation.Logging

    - by KnownColor
    Hello everyone, my question is whether it is possible to get the line/multiline (very unsure of the correct term for this) behaviour of the Trace.Write and Trace.WriteLine methods, but using the Microsoft Instrumentation Logging framework in .NET 2.0.
    Desired output:
    Hello World!
    Oh Hai.
    What I currently have:
    Trace.Write("Hello ");
    Trace.WriteLine("World!");
    Trace.Write("Oh Hai.");
    I would prefer to use instrumentation to log rather than writing to a log file using Debug.Trace. EDIT: By Instrumentation Logging I mean using a 'loggingConfiguration' block in my App.config and writing log entries using Microsoft.Practices.EnterpriseLibrary.Logging.Logger.Write(LogEntry logEntry); with, for example, Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.FlatFileTraceListenerData, Microsoft.Practices.EnterpriseLibrary.Logging, Version=2.0.0.0. Ta, KnownColor
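    One workable approach is a thin wrapper that buffers Write fragments and only emits a log entry when WriteLine completes the line. A hedged C# sketch follows; the sink delegate is where an Enterprise Library call such as Logger.Write(new LogEntry { Message = line }) would go, but here it is just an Action<string> so the sketch stands alone:

        using System;
        using System.Text;

        // Buffers Trace.Write-style fragments; each WriteLine flushes the
        // assembled line to the underlying entry-oriented logger as one entry.
        class LineBufferingLogger
        {
            private readonly StringBuilder _buffer = new StringBuilder();
            private readonly Action<string> _sink;

            public LineBufferingLogger(Action<string> sink)
            {
                _sink = sink;
            }

            // Accumulate a fragment without emitting a log entry yet.
            public void Write(string text)
            {
                _buffer.Append(text);
            }

            // Complete the line and flush it as a single entry.
            public void WriteLine(string text)
            {
                _buffer.Append(text);
                _sink(_buffer.ToString());
                _buffer.Length = 0;
            }
        }

        class Demo
        {
            static void Main()
            {
                var log = new LineBufferingLogger(Console.WriteLine);
                log.Write("Hello ");
                log.WriteLine("World!");   // emits "Hello World!" as one entry
                log.Write("Oh Hai.");      // still buffered until a later WriteLine completes the line
            }
        }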

    Read the article

  • GUI testing with Instrumentation in Android

    - by Sara
    I want to test my Android application's UI, with key events and pressed buttons and so on. I've read some documentation suggesting that Instrumentation could be used for this purpose. Does anyone have experience with using Instrumentation for UI testing?

    Read the article

  • Alternatives to Java bytecode instrumentation

    - by Rafael Regis
    I'm starting a project that will have to instrument Java applications for coverage purposes (definition-usage of variables, etc.). It has to add trace statements and some logic to the application, and maybe remove statements. I have searched for ways of instrumenting Java code, and what I always find is about bytecode instrumentation. My questions are: Is it the only way to instrument Java applications? Is there any other way to do it? What are the advantages of bytecode instrumentation over the others? I'll probably use the bytecode solution, but I want to know what the problems with the other approaches are (if any) so I can decide precisely. Thanks!

    Read the article

  • ETL Tools and Build Tools

    - by Ngu Soon Hui
    I am familiar with software automated build tools (such as Automated Build Studio). Now I am looking at ETL tools. One thing that crosses my mind is that I can do anything I can do in an ETL tool by using a software build tool. ETL tools are tailored for data loading and manipulation, for which a lot of scripts are needed in order to do the job. A software build tool, on the other hand, is versatile enough to do any job, including writing scripts to extract, transform, and load any data from any format into any format. Am I right?

    Read the article

  • Loop Control within a DataflowTask in ETL

    - by Ben
    Hi, being fairly new to SSIS and the ETL process, I was wondering if there is any way to loop through a record set within a Data Flow Task and pass each row (deriving parameters from the row) into a stored procedure (the next step in the ETL phase). Once I have passed the row into the stored procedure, I want the results from each iteration to be written to a table. Does anyone know how to do this? Thanks.
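    For what it's worth, SSIS's OLE DB Command transformation executes a SQL statement (including a stored procedure call) once per row flowing through a data flow, which is the usual designer-level answer. Expressed as a hedged plain ADO.NET sketch (all object names are placeholders), the per-row pattern the question describes looks like this:

        using System.Data;
        using System.Data.SqlClient;

        // Read a record set, call a stored procedure once per row with
        // parameters derived from that row, and write each result to a table.
        class PerRowProc
        {
            static void Main()
            {
                using var conn = new SqlConnection(
                    "Server=.;Database=etl;Integrated Security=true");
                conn.Open();

                var rows = new DataTable();
                using (var adapter = new SqlDataAdapter("SELECT id, amount FROM staging", conn))
                    adapter.Fill(rows);

                foreach (DataRow row in rows.Rows)
                {
                    using var proc = new SqlCommand("dbo.ProcessRow", conn)
                    {
                        CommandType = CommandType.StoredProcedure
                    };
                    proc.Parameters.AddWithValue("@id", row["id"]);
                    proc.Parameters.AddWithValue("@amount", row["amount"]);
                    var outcome = proc.ExecuteScalar();   // per-row result

                    using var insert = new SqlCommand(
                        "INSERT INTO results(id, outcome) VALUES (@id, @outcome)", conn);
                    insert.Parameters.AddWithValue("@id", row["id"]);
                    insert.Parameters.AddWithValue("@outcome", outcome);
                    insert.ExecuteNonQuery();
                }
            }
        }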

    Read the article

  • What is instrumentation?

    - by Jon Seigel
    I've heard this term used a lot in the same context as logging, but I can't seem to find a clear definition of what it actually is. Is it simply a more general class of logging/monitoring tools and activities? Please provide sample code/scenarios showing when/how instrumentation should be used.
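    To make the distinction concrete, here is a small illustrative C# sketch (not tied to any particular framework): logging records a discrete event as text, while instrumentation attaches measurements such as timings and counters to the code so its behavior can be observed in aggregate.

        using System;
        using System.Diagnostics;

        class PaymentService
        {
            static long _processedCount;                      // instrumentation: a counter
            static readonly Stopwatch Timer = new Stopwatch();

            static void ProcessPayment(decimal amount)
            {
                Timer.Restart();                              // instrumentation: time the operation
                // ... real work would happen here ...
                Timer.Stop();
                _processedCount++;

                // logging: a human-readable record of this single event,
                // enriched with the instrumented measurements
                Console.WriteLine(
                    "processed payment of " + amount + " in " +
                    Timer.ElapsedMilliseconds + " ms (total processed: " + _processedCount + ")");
            }

            static void Main()
            {
                ProcessPayment(42.50m);
            }
        }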

    Read the article

  • Oracle Warehouse Builder and Enterprise ETL

    - by Fekete Zoltán
    The datasheet is hot off the press. Enjoy: the ODI Enterprise Edition: Warehouse Builder Enterprise ETL white paper. The good news: the basic (core) functionality of Oracle Warehouse Builder can be used free of charge with every purchased Oracle Database. So what is the OWB core functionality, and what can we use from the options? The Enterprise ETL functionality is available for OWB as part of the Oracle Data Integrator Enterprise Edition license. The features available only with the ODI EE license (the former OWB Enterprise ETL option is also part of this) are listed here, at the bottom of the text. They are:
    - Transportable ETL modules, multiple configurations, and pluggable mappings
    - Operators for pluggable mapping, pluggable mapping input signature, pluggable mapping output signature
    - Design Environment Support for RAC
    - Metadata change propagation
    - Schedulable Mappings and Process Flows
    - Slowly Changing Dimensions (SCD) Type 2 and 3
    - XML Files as a target
    - Target load ordering
    - Seeded spatial and streams transformations
    - Process Flow Activity templates
    - Process Flow variables support
    - Process Flow looping activities such as For Loop and While Loop
    - Process Flow Route and Notification activities
    - Metadata lineage and impact analysis
    - Metadata Extensibility
    - Deployment to Discoverer EUL
    - Deployment to Oracle BI Beans catalog
    So if you want to use OWB in a more serious environment, deploy to multiple environments, and so on, you also need the ODI EE license. ODI Enterprise Edition: Warehouse Builder Enterprise ETL white paper.

    Read the article

  • Enterprise Instrumentation: The 'sessionName' parameter of value 'TraceSession' is not valid

    - by Michael Freidgeim
    We are still using Enterprise Instrumentation (created back in the .NET 1.1 days). In the new Server 2008 environment with IIS 7, we get the following errors:
    The 'sessionName' parameter of value 'TraceSession' is not valid. A trace session of this name does not exist in the TraceSessions configuration file for Windows Trace Session Manager service. Ensure that a session of this name exists in the TraceSessions configuration file and that the Windows Trace Session Manager service is started.
       at Microsoft.EnterpriseInstrumentation.EventSinks.TraceEventSink..ctor(IDictionary parameters, EventSource eventSource)
       --- End of inner exception stack trace ---
       at System.RuntimeMethodHandle._InvokeConstructor(IRuntimeMethodInfo method, Object[] args, SignatureStruct& signature, RuntimeType declaringType)
       at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
       at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
       at Microsoft.EnterpriseInstrumentation.EventSinks.EventSink.CreateNewEventSinks(DataRow[] eventSinkRows, EventSource eventSource)
    I've seen the same errors on development Win7 machines when using IIS; it doesn't seem to be a problem under Cassini. I've checked that the Windows Trace Session Manager service has started and that the file C:\Program Files (x86)\Microsoft Enterprise Instrumentation\Bin\Trace Service\TraceSessions.config has the corresponding entry:
    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>
      <defaultParameters minBuffers="4" maxFileSize="10" maxBuffers="25" bufferSize="20" logFileMode="sequential" flushTimer="3" />
      <sessionList>
        <session name="TraceSession" enabled="false" fileName="C:\Program Files (x86)\Microsoft Enterprise Instrumentation\Bin\Trace Service\Logs\TraceLog.log" />
      </sessionList>
    </configuration>
    The errors still continued, but I was able to disable the offending parameter in the eventSink configuration to avoid them:
    <eventSink name="traceSink" description="Outputs events to the Windows Event Trace." type="Microsoft.EnterpriseInstrumentation.EventSinks.TraceEventSink">
      <!-- MNF: disabled parameter to avoid the error "The 'sessionName' parameter of value 'TraceSession' is not valid"
           <parameter name="sessionName" value="TraceSession" />
      -->
    </eventSink>
    Related old post: http://bytes.com/topic/net/answers/104761-enterprise-instrumentation-windows-trace-session-manager
    One day I wish to replace all EnterpriseInstrumentation calls with NLog.
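    As a hedged sketch of what that NLog replacement might look like (assuming the NLog package is referenced; the target name and file path are placeholders), the whole trace-session apparatus collapses to a code-configured file target:

        using NLog;
        using NLog.Config;
        using NLog.Targets;

        class NLogSketch
        {
            private static readonly Logger Log = LogManager.GetCurrentClassLogger();

            static void Main()
            {
                // One file target configured in code; no Windows Trace Session
                // Manager service or TraceSessions.config involved.
                var config = new LoggingConfiguration();
                var file = new FileTarget("file") { FileName = "TraceLog.log" };
                config.AddRule(LogLevel.Trace, LogLevel.Fatal, file);
                LogManager.Configuration = config;

                Log.Trace("replaces the Enterprise Instrumentation trace sink");
            }
        }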

    Read the article

  • Google I/O 2010 - Appstats - instrumentation for App Engine

    Google I/O 2010 - Appstats: RPC instrumentation and optimizations for App Engine (App Engine 201, Guido van Rossum). Appstats is a pure userland library (for Python and Java) that inserts instrumentation hooks into the App Engine runtime at the interface between the runtime and services like the datastore. The collected statistics can be browsed in a rich UI which allows drilling down to various levels of detail. The talk also discusses common optimizations to address typical findings. For all I/O 2010 sessions, please go to code.google.com. From: GoogleDevelopers. Time: 59:31.

    Read the article
