Search Results

Search found 447 results on 18 pages for 'rhino etl'.

Page 6/18 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13  | Next Page >

  • External table and preprocessor for loading LOBs

    - by David Allan
    I was using the COLUMN TRANSFORMS syntax to load LOBs into Oracle using the Oracle external table, which is a handy way of doing several things - from loading LOBs from the filesystem to having constants as fields. In OWB you can use unbound external tables to define an external table using your own arbitrary access parameters - I blogged a while back on this for doing preprocessing before it was added into OWB 11gR2. For loading LOBs using the COLUMN TRANSFORMS syntax, have a read through this post on loading CLOB, BLOB or any LOB; the files to load can be specified as a filename field, and the content of each file will be the LOB data. So using the example from the linked post, you can define the columns; Then define the access parameters - if you go the unbound external table route you can put whatever you want in here (your external table get-out-of-jail-free card); This will let you read the LOB files from the filesystem and use the external table in a mapping. Pushing the envelope a little further, I then thought about marrying together the preprocessor with the COLUMN TRANSFORMS; this would have let me have a shell script, for example, as the preprocessor which listed the contents of a directory and let me read the files as LOBs via an external table. Unfortunately that doesn't quite work - there is now a bug/enhancement logged, so one day maybe. So I'm afraid my blog title was a little bit of a teaser....

    Read the article

  • OWB - 11.2.0.4 Windows standalone client released

    - by David Allan
    The 11.2.0.4 release of OWB containing the 32-bit and 64-bit standalone Windows client is released today; I had previously blogged about the Linux standalone client here. Big thanks to Anil for spearheading that, another milestone on the Data Integration roadmap. Below are the patch numbers:
      17743124 - OWB 11.2.0.4 STANDALONE CLIENT FOR Windows 64 BIT
      17743119 - OWB 11.2.0.4 STANDALONE CLIENT FOR Windows 32 BIT
    This is the terminal release of OWB and customer bugs will be resolved on top of this release. We are excited to share information on the Oracle Data Integration 12c release in our upcoming launch video webcast on November 12th.

    Read the article

  • Have You Downloaded SQL Server 2012 Evaluation Edition? Why Not?!

    - by andyleonard
    I am installing SQL Server 2012 Evaluation Edition on a virtual machine as I type. You can do this. Here’s one way:
      1. Grab some virtual machine software. I like Oracle VirtualBox. It’s cool. It’s free.
      2. Install VirtualBox.
      3. Download the 180-day free trial of Windows Server 2008 R2. Also cool. Also free.
      4. Once Windows Server 2008 R2 is downloaded, build a VirtualBox VM.
      5. Download and install SQL Server 2012 Evaluation Edition!
    That’s all there is to it. You can get started today, no need to wait until...

    Read the article

  • Data Warehouse ETL slow - change primary key in dimension?

    - by Jubbles
    I have a working MySQL data warehouse that is organized as a star schema, and I am using Talend Open Studio for Data Integration 5.1 to create the ETL process. I would like this process to run once per day. I have estimated that one of the dimension tables (dimUser) will have approximately 2 million records and 23 columns. I created a small test ETL process in Talend that worked, but given the amount of data that may need to be updated daily, the current performance will not cut it. It takes the ETL process four minutes to UPDATE or INSERT 100 records to dimUser. If I assume a linear relationship between the count of records and the amount of time to UPDATE or INSERT, then there is no way the ETL can finish in 3-4 hours (my hope), let alone one day. Since I'm unfamiliar with Java, I wrote the ETL as a Python script and ran into the same problem. However, I did discover that if I did only INSERTs, the process went much faster. I am pretty sure that the bottleneck is caused by the UPDATE statements. The primary key in dimUser is an auto-increment integer. My friend suggested that I scrap this primary key and replace it with a multi-field primary key (in my case, 2-3 fields). Before I rip the test data out of my warehouse and change the schema, can anyone provide suggestions or guidelines related to:
      - the design of the data warehouse
      - the ETL process
      - how realistic it is to have an ETL process INSERT or UPDATE a few million records each day
      - whether my friend's suggestion will significantly help
    If you need any further information, just let me know and I'll post it.

    UPDATE - additional information:

    mysql> describe dimUser;
    +-------------+---------------------+------+-----+---------------------+----------------+
    | Field       | Type                | Null | Key | Default             | Extra          |
    +-------------+---------------------+------+-----+---------------------+----------------+
    | user_key    | int(10) unsigned    | NO   | PRI | NULL                | auto_increment |
    | id_A        | int(10) unsigned    | NO   |     | NULL                |                |
    | id_B        | int(10) unsigned    | NO   |     | NULL                |                |
    | field_4     | tinyint(4) unsigned | NO   |     | 0                   |                |
    | field_5     | varchar(50)         | YES  |     | NULL                |                |
    | city        | varchar(50)         | YES  |     | NULL                |                |
    | state       | varchar(2)          | YES  |     | NULL                |                |
    | country     | varchar(50)         | YES  |     | NULL                |                |
    | zip_code    | varchar(10)         | NO   |     | 99999               |                |
    | field_10    | tinyint(1)          | NO   |     | 0                   |                |
    | field_11    | tinyint(1)          | NO   |     | 0                   |                |
    | field_12    | tinyint(1)          | NO   |     | 0                   |                |
    | field_13    | tinyint(1)          | NO   |     | 1                   |                |
    | field_14    | tinyint(1)          | NO   |     | 0                   |                |
    | field_15    | tinyint(1)          | NO   |     | 0                   |                |
    | field_16    | tinyint(1)          | NO   |     | 0                   |                |
    | field_17    | tinyint(1)          | NO   |     | 1                   |                |
    | field_18    | tinyint(1)          | NO   |     | 0                   |                |
    | field_19    | tinyint(1)          | NO   |     | 0                   |                |
    | field_20    | tinyint(1)          | NO   |     | 0                   |                |
    | create_date | datetime            | NO   |     | 2012-01-01 00:00:00 |                |
    | last_update | datetime            | NO   |     | 2012-01-01 00:00:00 |                |
    | run_id      | int(10) unsigned    | NO   |     | 999                 |                |
    +-------------+---------------------+------+-----+---------------------+----------------+

    I used a surrogate key because I had read that it was good practice. From a business perspective, I want to stay aware of potential fraudulent activity (say for 200 days a user is associated with state X and then the next day they are associated with state Y - they could have moved or their account could have been compromised), which is why geographic data is kept. The field id_B may have a few distinct values of id_A associated with it, but I am interested in knowing distinct (id_A, id_B) tuples. In the context of this information, my friend suggested that something like (id_A, id_B, zip_code) be the primary key. For the large majority of daily ETL processes (80%), I only expect the following fields to be updated for existing records: field_10 - field_14, last_update, and run_id (this field is a foreign key to my etlLog table and is used for ETL auditing purposes).
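
One direction the question leaves open is keeping the surrogate key while adding a unique index on the natural key, so that inserts and updates can be sent as one batched upsert instead of row-by-row statements. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL (where the comparable statement is INSERT ... ON DUPLICATE KEY UPDATE), the table is a trimmed-down, hypothetical version of dimUser, and the helper name upsert_batch is made up. Whether this removes the bottleneck described depends on where the time actually goes.

```python
import sqlite3

# Hypothetical, trimmed-down dimUser: surrogate key kept, natural key made unique.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dimUser (
        user_key INTEGER PRIMARY KEY AUTOINCREMENT,
        id_A     INTEGER NOT NULL,
        id_B     INTEGER NOT NULL,
        field_10 INTEGER NOT NULL DEFAULT 0,
        run_id   INTEGER NOT NULL DEFAULT 999,
        UNIQUE (id_A, id_B)
    )
""")

def upsert_batch(conn, rows):
    """Insert new (id_A, id_B) pairs and update existing ones in one transaction."""
    with conn:  # commits once for the whole batch
        conn.executemany(
            """
            INSERT INTO dimUser (id_A, id_B, field_10, run_id)
            VALUES (?, ?, ?, ?)
            ON CONFLICT (id_A, id_B) DO UPDATE SET
                field_10 = excluded.field_10,
                run_id   = excluded.run_id
            """,
            rows,
        )

# First run loads two users; second run updates one and adds another.
upsert_batch(conn, [(1, 10, 0, 1), (2, 20, 0, 1)])
upsert_batch(conn, [(1, 10, 1, 2), (3, 30, 0, 2)])

rows = conn.execute(
    "SELECT id_A, id_B, field_10, run_id FROM dimUser ORDER BY id_A"
).fetchall()
key = conn.execute(
    "SELECT user_key FROM dimUser WHERE id_A = 1 AND id_B = 10"
).fetchone()[0]
```

The updated row keeps its original user_key, so fact tables that reference the surrogate key are unaffected. The SQLite upsert syntax used here needs SQLite 3.24 or later.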

    Read the article

  • Data validation tools (ETL tools) for SQL Server

    - by Stan
    I have some data in Excel and need to import it into a database. Is there any tool that can validate and maybe clean the data? Does Red Gate have such a tool? The input will be Excel. Given table constraints, e.g. CHECK, UNIQUE KEY, datetime format, NOT NULL, the desired output should at least show which lines have problems, and then fix some trivial errors automatically, like filling in default values for NULL columns and automatically correcting the datetime format. I know a Python script can be built to do this, but I just wonder what the popular way to do this is. Thanks.
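
The kind of Python script mentioned above could look roughly like the following. It is a minimal sketch, not a recommendation of a particular tool: the column names, defaults and accepted date formats are all hypothetical, and the rows are plain dictionaries (reading them out of an Excel file is left out and would need a spreadsheet library).

```python
from datetime import datetime

# Hypothetical rules for an imaginary target table; adjust to the real constraints.
REQUIRED = {"name"}                 # NOT NULL columns without a default
DEFAULTS = {"country": "unknown"}   # empty columns that may be filled automatically
DATE_COLUMNS = {"joined"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y")
UNIQUE = "email"

def normalize_date(text):
    """Return the date in ISO format, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def validate(rows):
    """Return (cleaned_rows, problems); problems are (line_number, message) pairs."""
    cleaned, problems, seen = [], [], set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is assumed to be the header
        row = dict(row)
        errors = []
        for col, default in DEFAULTS.items():      # trivial fix: fill in defaults
            if row.get(col) in (None, ""):
                row[col] = default
        for col in REQUIRED:                       # NOT NULL check
            if row.get(col) in (None, ""):
                errors.append(f"{col} must not be empty")
        for col in DATE_COLUMNS:                   # trivial fix: normalise dates
            fixed = normalize_date(str(row.get(col, "")))
            if fixed is None:
                errors.append(f"{col} is not a recognised date")
            else:
                row[col] = fixed
        key = row.get(UNIQUE)                      # UNIQUE KEY check
        if key in seen:
            errors.append(f"duplicate {UNIQUE}: {key}")
        seen.add(key)
        if errors:
            problems.extend((line_no, e) for e in errors)
        else:
            cleaned.append(row)
    return cleaned, problems

sample = [
    {"name": "Ann", "email": "a@x.org", "joined": "31/01/2012", "country": ""},
    {"name": "", "email": "b@x.org", "joined": "2012-02-01", "country": "UK"},
    {"name": "Cy", "email": "a@x.org", "joined": "not a date", "country": "US"},
]
cleaned, problems = validate(sample)
```

Rows with any error are reported with their line number and held back; rows that only needed a default or a date reformat are passed through in cleaned form.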

    Read the article

  • Using RhinoMocks, how do you mock or stub a concrete class without an empty constructor?

    - by Mark Rogers
    Mocking a concrete class with Rhino Mocks seems to work pretty easily when you have an empty constructor on a class: public class MyClass{ public MyClass() {} } But if I add a constructor that takes parameters and remove the one that doesn't take parameters: public class MyClass{ public MyClass(MyOtherClass instance) {} } I tend to get an exception: System.MissingMethodException : Can't find a constructor with matching arguments. I've tried putting in nulls in my call to Mock or Stub, but it doesn't work. Can I create mocks or stubs of concrete classes with Rhino Mocks, or must I always supply (implicitly or explicitly) a parameterless constructor?

    Read the article

  • Decent JavaScript IDE

    - by thatismatt
    What is a decent IDE for developing JavaScript? I'll be writing both client-side stuff and writing for Rhino. Ideally it needs to run on Mac OS X, although something that runs on Windows too would be nice. ADDITIONAL: Having had a play with both js2 and Aptana, I think I'll be continuing to use Aptana, mainly because I find emacs a bit hard to get my head round, although I did think that the error highlighting in js2 was better than that in Aptana. I'm still looking for a way to visually debug my JS code that is running atop Rhino...

    Read the article

  • generated service mock: everything but RhinoMocks fails?

    - by hko
    I have the "quest" to search for the next mocking framework for my company, and basically it's down to NSubstitute (simplest syntax, but no strict mocks), FakeItEasy (best reviews, Roy Osherove bonus, and slightly better lib support than NSubstitute), and Moq (best "other libs support", biggest feature set, downside: mock.Object). We definitely want to move on from RhinoMocks, e.g. because of the unhelpful interaction-test error messages (it should tell me what the parameter was instead, when a verification fails). So I was pretty surprised the other day (that was yesterday) when I found out RhinoMocks could do a thing where every other mock framework fails: mocking an autogenerated SomethingService (a typical VS autogenerated service with a default constructor in a partial class). Please don't argue about the design... I intend to write lightweight integration tests (and some unit tests), and I can't mess around with the service; the product is installed on too many customers' systems. See this code:

        // here the NSubstitute and FakeItEasy equivalents throw an exception.. see below
        TicketStoreService fakeTicketStoreService = MockRepository.GenerateMock<TicketStoreService>();
        fakeTicketStoreService.Expect(service => service.DoSomething(Arg.Is(new Guid()))).Return(new Guid());
        fakeTicketStoreService.DoSomething(Arg.Is(new Guid()));
        fakeTicketStoreService.VerifyAllExpectations();

    Note that DoSomething is a non-virtual method call in an autogenerated class. So it shouldn't work, according to common knowledge. But it does. The problem is that it's the only (non-commercial) framework that can do this:
      - Rhino.Mocks works, and verification works too
      - FakeItEasy says it doesn't find a default constructor (probably just the wrong exception message): No default constructor was found on the type SomeNamespace.TicketStoreService
      - Moq gives something sane and understandable: Invalid setup on a non-virtual (overridable in VB) member: service=> service.DoSomething
      - NSubstitute gives a message: System.NotSupportedException: Cannot serialize member System.ComponentModel.Component.Site of type System.ComponentModel.ISite because it is an interface.
    I'm really wondering what's going on here with the frameworks, except Moq. The "fancy new" frameworks seem to have an initial perf hit too, probably preparing some Type cache and serializing stuff, whilst RhinoMocks somehow manages to create a very "slim" mock without recursion. I have to admit I didn't like RhinoMocks very much, but here it shines... unfortunately. So, is there a way to get that to work with newer (non-commercial!) mocking frameworks, or somehow get a sane error message out of Rhino.Mocks? And why can Rhino.Mocks achieve this, when clearly every mocking framework states it can only work with virtual methods when given a concrete class? Let's not derail the discussion by talking about alternative approaches like Extract & Override or runtime-proxy mocking frameworks like JustMock/TypeMock/Moles or the new Fakes framework. I know these, but they would be less ideal solutions, for reasons beyond this topic. Any help appreciated.

    Read the article

  • How to figure out which record has been deleted in an efficient way?

    - by janetsmith
    Hi, I am working on an in-house ETL solution, from db1 (Oracle) to db2 (Sybase). We need to transfer data incrementally (Change Data Capture?) into db2. I have only read access to tables, so I can't create any table or trigger in Oracle db1. The challenge I am facing is how to detect record deletion in Oracle. The solution I can think of is to use an additional standalone/embedded db (e.g. Derby, H2 etc). This db contains 2 tables, namely old_data and new_data. old_data contains the primary key field from the table of interest in Oracle. Every time the ETL process runs, the new_data table will be populated with the primary key field from the Oracle table. After that, I will run the following SQL command to get the deleted rows:
        SELECT old_data.id FROM old_data WHERE old_data.id NOT IN (SELECT new_data.id FROM new_data)
    I think this will be a very expensive operation when the volume of data becomes very large. Do you have any better idea of doing this? Thanks.
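
The snapshot comparison described above can be sketched as follows. This is only an illustration under assumptions: sqlite3 stands in for the embedded database (Derby, H2), the list of current ids stands in for the keys read from Oracle, the function name find_deleted_ids is made up, and EXCEPT is used in place of NOT IN as one way to express the same set difference over two indexed key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both snapshot tables hold only the primary key of the table of interest.
conn.execute("CREATE TABLE old_data (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE new_data (id INTEGER PRIMARY KEY)")

def find_deleted_ids(conn, current_ids):
    """Return ids that were in the previous snapshot but are missing now,
    then make the current snapshot the baseline for the next run."""
    with conn:
        conn.execute("DELETE FROM new_data")
        conn.executemany(
            "INSERT INTO new_data (id) VALUES (?)", ((i,) for i in current_ids)
        )
        # Set difference: keys seen last run that are no longer present.
        deleted = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM old_data EXCEPT SELECT id FROM new_data ORDER BY id"
            )
        ]
        conn.execute("DELETE FROM old_data")
        conn.execute("INSERT INTO old_data (id) SELECT id FROM new_data")
    return deleted

first = find_deleted_ids(conn, [1, 2, 3, 4])   # first run: nothing to compare against
second = find_deleted_ids(conn, [1, 3, 4, 5])  # id 2 has disappeared
third = find_deleted_ids(conn, [1, 5])         # ids 3 and 4 have disappeared
```

How this behaves at large volumes would need to be measured on the real data; the sketch only shows the mechanics of the comparison.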

    Read the article

  • extract transform load

    - by mitch
    Wikipedia defines a 'typical' ETL cycle as:
      1. Cycle initiation
      2. Build reference data
      3. Extract (from sources)
      4. Validate
      5. Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates)
      6. Stage (load into staging tables, if used)
      7. Audit reports (for example, on compliance with business rules. Also, in case of failure, helps to diagnose/repair)
      8. Publish (to target tables)
      9. Archive
      10. Clean up
    What is meant by 'Build reference data'?

    Read the article

  • What are CAD apps written in, and how are they organized ?

    - by ldigas
    What are today's CAD applications (Rhino, AutoCAD) written in, and how are they organized internally? I gave AutoCAD and Rhino as examples, although I would love to hear of other examples as well. I'm particularly interested in knowing what their backend is written in (multilanguage?), how it is organized, and how they handle their frontend (GUI) in real time. Do they use native Windows APIs or some libraries of their own? I imagine that, as good as they may be, the open source solutions on today's market won't cut it. I may be wrong... As most of you who have used them know, they handle, among other things, relatively complex rotational operations in real time (shading does not interest me). I've been doing some experiments with several packages recently, and for some larger models found that there is a considerable difference in speed in, for example, programmed rotation (big full ship models) among some of them (which I won't name). So I'm wondering about their internals... Also, if someone knows of a book on the subject, I'd be interested to hear of it.

    Read the article

  • How to extract data from Google Analytics and build a data warehouse (webhouse) from it?

    - by nkaur301
    I have click stream data such as referring URLs, top landing pages, top exit pages and metrics such as page views, number of visits and bounces, all in Google Analytics. I am required to build a data warehouse from scratch (which I believe is known as a webhouse) from this data. My questions are: 1) Is it possible? Every day the data increases (some in terms of metrics or measures such as visits, and some in terms of new referring sites); how would the process of loading the warehouse work? 2) What ETL tool would help me to achieve this? Pentaho, I believe, has a way to pull data out of Google Analytics; has anyone used it? How does that process go? Any references or links would be appreciated besides answers.

    Read the article

  • Best way to mock WCF Client proxy

    - by chugh97
    Are there any ways to mock a WCF client proxy using the Rhino Mocks framework so I have access to the Channel property? I am trying to unit test the Proxy.Close() method, but as the proxy is constructed using the abstract base class ClientBase, which implements the ICommunicationObject interface, my unit test is failing because the internal infrastructure of the class is absent in the mock object. Any good ways with code samples would be greatly appreciated.

    Read the article

  • Server side Javascript best practices?

    - by Petteri Hietavirta
    We have a CMS built on Java and it has Mozilla Rhino for the server side JS. At the moment the JS code base is small but growing. Before it is too late and code has become a horrible mess I want to introduce some best practices and coding style. Obviously the name space control is pretty important. But how about other best practices - especially for Java programmers?

    Read the article

  • Java 6 ScriptEngine and JSON.parse problem

    - by Tim
    The Rhino release that is included in the Java 6 ScriptEngine does not have a JSON parser. I've tried including Crockford's json2.js in my script on the scriptengine.eval(). When I try to do the JSON.parse, it ends up giving me a script error that .replace is an unknown function. .replace is referenced in several places in json2.js, and it works fine inside a browser (IE7, IE8, FF3). Has anyone seen this and have a suggestion?

    Read the article

  • IgnoreArguments including an Action<T>

    - by James L
    I'm probably missing something obvious here, so apologies in advance! Using Rhino Mocks, how do I set an expectation that a method taking an Action<T> will be called, when I want to use IgnoreArguments? Obviously I can't specify null as that isn't an Action, and I don't want any meaningless code in the test. As I said, it's probably obvious, but the syntax is eluding me at the moment!

    Read the article

  • TDD a controller with ASP.NET MVC 2, NUnit and Rhino Mocks

    - by Nissan Fan
    What would a simple unit test look like to confirm that a certain controller exists if I am using Rhino Mocks, NUnit and ASP.NET MVC 2? I'm trying to wrap my head around the concept of TDD, but I can't seem to figure out how a simple test like "Controller XYZ Exists" would look. In addition, what would the unit test look like to test an Action Result off a view?

    Read the article

  • how can protected members of base class be accessed during unit test?

    - by amateur
    I am creating a unit test in mstest with rhino mocks. I have a class A that inherits class B. I am testing class A and create an instance of it for my test. The class it inherits, "B", has some protected methods and protected properties that I would like to access for the benefit of my tests. For example, validate that a protected property on my base class has the expected value. Any ideas how I might access these protected properties of class B during my test?

    Read the article

  • How do I combine two interfaces when creating mocks?

    - by sduplooy
    We are using Rhino Mocks to perform some unit testing and need to mock two interfaces. Only one interface is implemented on the object and the other is implemented dynamically using an aspect-oriented approach. Is there an easy way to combine the two interfaces dynamically so that a mock can be created and the methods stubbed for both interfaces?

    Read the article

  • Empty data problem - data layer or DAL?

    - by luckyluke
    I am designing a new app now and giving the following question a lot of thought. I consume a lot of data from the warehouse, and the entities have a lot of dictionary-based values (currency, country, tax-whatever data) - dimensions. I cannot be assured, though, that there won't be nulls. So I am thinking:
      - create an empty value in each of the dictionaries with a special keyID, i.e. -1
      - let the ETL (SSIS) do the correct stuff and insert -1 where it needs to
      - let the DAL know that -1 is special (static const whatever thing)
      - don't care in the code to check for nullness of dictionary entries because THEY will always have a value
    But maybe I should be thinking:
      - import data AS IS
      - let the DAL do the thinking using the empty record pattern
      - still don't care in the code because the business layer will have what it needs from the DAL
    I think this is more of an approach thing, but maybe I am missing something important here... What do you think? Am I clear? Please don't confuse it with the empty record problem. I do use the emptyCustomer thing all the time, and other defaults too.
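
The first option above (a reserved -1 member in every dictionary) can be sketched like this. It is a minimal illustration with a made-up currency dimension and made-up names; it also assumes that codes missing from the dictionary are treated the same way as nulls, which is a design choice rather than something stated in the question.

```python
UNKNOWN_KEY = -1  # reserved surrogate key for the "empty" dictionary entry

# Hypothetical currency dimension, including the special empty member.
currency_dim = {UNKNOWN_KEY: "(unknown)", 1: "EUR", 2: "USD"}

# Reverse lookup used by the ETL step: business code -> surrogate key.
currency_lookup = {
    code: key for key, code in currency_dim.items() if key != UNKNOWN_KEY
}

def to_dimension_key(code):
    """Map a source value to its dimension key; nulls and unknown codes map to -1."""
    return currency_lookup.get(code, UNKNOWN_KEY)

source_rows = [
    {"amount": 10, "currency": "EUR"},
    {"amount": 5, "currency": None},    # null in the source
    {"amount": 7, "currency": "XXX"},   # code missing from the dictionary
]

# ETL step: every loaded row ends up with a valid key, never a null.
loaded = [
    {"amount": r["amount"], "currency_key": to_dimension_key(r["currency"])}
    for r in source_rows
]

# Consumer code can dereference the key without any null checks.
labels = [currency_dim[row["currency_key"]] for row in loaded]
```

Because the -1 member exists in the dictionary itself, the consuming code never needs to test for a missing entry.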

    Read the article

  • Handling primary key duplicates in a data warehouse load

    - by Meff
    I'm currently building an ETL system to load a data warehouse from a transactional system. The grain of my fact table is the transaction level. In order to ensure I don't load duplicate rows I've put a primary key on the fact table, which is the transaction ID. I've encountered a problem with transactions being reversed - In the transactional database this is done via a status, which I pick up and I can work out if the transaction is being done, or rolled back so I can load a reversal row in the warehouse. However, the reversal row will have the same transaction ID and so I get a primary key violation. I've solved this for now by negating the primary key, so transaction ID 1 would be a payment, and transaction ID -1 (In the warehouse only) would be the reversal. I have considered an alternative of generating a BIT column, where 0 is normal and 1 is reversal, then making the PK the transaction ID and the BIT column. My question is, is this a good practice, and has anyone else encountered anything like this? For reference, this is a payment processing system, so values will not be modified, so there will only ever be transactions and reversals.
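
The alternative mentioned above, a composite primary key of the transaction ID plus a reversal flag, can be sketched as follows. This is an illustration under assumptions: sqlite3 stands in for the warehouse database, the table and function names are hypothetical, and storing the reversal with a negated amount is an added assumption, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table: one row per transaction, plus at most one reversal row.
conn.execute("""
    CREATE TABLE fact_payment (
        transaction_id INTEGER NOT NULL,
        is_reversal    INTEGER NOT NULL DEFAULT 0 CHECK (is_reversal IN (0, 1)),
        amount         NUMERIC NOT NULL,
        PRIMARY KEY (transaction_id, is_reversal)
    )
""")

def load(conn, transaction_id, amount, is_reversal=False):
    """Insert one fact row; return False if that exact row was already loaded."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO fact_payment VALUES (?, ?, ?)",
                # Assumption: a reversal is stored with the opposite sign.
                (transaction_id, int(is_reversal), -amount if is_reversal else amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate (transaction_id, is_reversal): skip it

payment_loaded = load(conn, 1, 100)
reversal_loaded = load(conn, 1, 100, is_reversal=True)
duplicate_loaded = load(conn, 1, 100)  # same payment arriving again

net = conn.execute(
    "SELECT SUM(amount) FROM fact_payment WHERE transaction_id = 1"
).fetchone()[0]
```

Compared with negating the transaction ID, this keeps the original ID intact on both rows, so the payment and its reversal can be joined or grouped on transaction_id directly.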

    Read the article

  • Disable selected automated tests at runtime

    - by squig
    Is it possible to disable selected automated tests at runtime? I'm using VSTS and Rhino Mocks and have some integration tests that require an external dependency to be installed (MQ). Not all the developers on my team have this installed. Currently all the tests that require MQ inherit from a base class that checks if MQ is installed and, if it is not, sets the test result to inconclusive. This works as it stops the tests from running, but it marks the test run as unsuccessful and can hide other failures. Any ideas?

    Read the article
