Search Results

Search found 447 results on 18 pages for 'rhino etl'.

Page 6/18 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >

OWB 11gR2 – OMB and File Editing

- by David Allan

Here we will see how we can use the IDE for editing OMB scripts. The 11gR2 release is based on the common Oracle platform IDE used also by JDeveloper. It comes with a bunch of standard behavior for editing and rendering code. One of the lesser known things is that if you drop a text file into OWB you can edit it. So you can drop your tcl scripts right into OWB and edit in-place, and don’t need another IDE like Eclipse just for this task. Cool, so you have the file here. There may be no line numbers, you can toggle line numbers on by right clicking in the gutter. If we edit the file within the OWB IDE, the save is a little different from normal. OWB doesn’t normally manipulate files so things like ctrl-s to save, saves the OWB objects, but if you edit a file the closing of the file will ask if you want to save it – check it out. Now we enter the realm of ‘he who dares’…. Note the IDE doesn’t know about tcl files out of the box, so you see above there is no syntax highlighting. The code is identified by the extension… .java is java, .html is HTML etc. With OWB, the OMB scripts are tcl, we usually have .tcl extension on these files. One of the things we can do to trick up the syntax highlighting is to simply rename the file to have a .java suffix, then all of a sudden we get syntax highlighting, see the illustration here where side by side we see a the file with a .java extension and a .tcl extension. Not ideal pretending to be .java but gets us a way to having something more useful than notepad. We can then change the syntax highlighting such that we get Eclipse like highlighting within the IDE from the Tools Preferences option; You then get the Eclipse like rendering albeit using a little tweak on the file names… Might be useful if you are doing any kind of heavy duty OMB script development and just want a single IDE. The OMBPlus panel is then at hand for executing and testing it out.

Read the article
OWB - 11.2.0.4 Windows standalone client released

- by David Allan

The 11.2.0.4 release of OWB containing the 32 bit and 64 bit standalone Windows client is released today, I had previously blogged about the Linux standalone client here. Big thanks to Anil for spearheading that, another milestone on the Data Integration roadmap. Below are the patch numbers; 17743124 - OWB 11.2.0.4 STANDALONE CLIENT FOR Windows 64 BIT 17743119 - OWB 11.2.0.4 STANDALONE CLIENT FOR Windows 32 BIT This is the terminal release of OWB and customer bugs will be resolved on top of this release. We are excited to share information on the Oracle Data Integration 12c release in our upcoming launch video webcast on November 12th.

Read the article
Have You Downloaded SQL Server 2012 Evaluation Edition? Why Not?!

- by andyleonard

I am installing SQL Server 2012 Evaluation Edition on a virtual machine as I type. You can do this. Here’s one way: Grab some virtual machine software. I like Oracle VirtualBox . It’s cool. It’s free. Install VirtualBox. Download the 180-day free trial of Windows Server 2008 R2 . Also cool. Also free. Once Windows Server 2008 R2 is downloaded, build a VirtualBox VM. Download and install SQL Server 2012 Evaluation Edition ! That’s all there is to it. You can get started today, no need to wait until...(read more)

Read the article
Data Warehouse ETL slow - change primary key in dimension?

- by Jubbles

I have a working MySQL data warehouse that is organized as a star schema and I am using Talend Open Studio for Data Integration 5.1 to create the ETL process. I would like this process to run once per day. I have estimated that one of the dimension tables (dimUser) will have approximately 2 million records and 23 columns. I created a small test ETL process in Talend that worked, but given the amount of data that may need to be updated daily, the current performance will not cut it. It takes the ETL process four minutes to UPDATE or INSERT 100 records to dimUser. If I assumed a linear relationship between the count of records and the amount of time to UPDATE or INSERT, then there is no way the ETL can finish in 3-4 hours (my hope), let alone one day. Since I'm unfamiliar with Java, I wrote the ETL as a Python script and ran into the same problem. Although, I did discover that if I did only INSERT, the process went much faster. I am pretty sure that the bottleneck is caused by the UPDATE statements. The primary key in dimUser is an auto-increment integer. My friend suggested that I scrap this primary key and replace it with a multi-field primary key (in my case, 2-3 fields). Before I rip the test data out of my warehouse and change the schema, can anyone provide suggestions or guidelines related to the design of the data warehouse the ETL process how realistic it is to have an ETL process INSERT or UPDATE a few million records each day will my friend's suggestion significantly help If you need any further information, just let me know and I'll post it. UPDATE - additional information: mysql> describe dimUser; Field Type Null Key Default Extra user_key int(10) unsigned NO PRI NULL auto_increment id_A int(10) unsigned NO NULL id_B int(10) unsigned NO NULL field_4 tinyint(4) unsigned NO 0 field_5 varchar(50) YES NULL city varchar(50) YES NULL state varchar(2) YES NULL country varchar(50) YES NULL zip_code varchar(10) NO 99999 field_10 tinyint(1) NO 0 field_11 tinyint(1) NO 0 field_12 tinyint(1) NO 0 field_13 tinyint(1) NO 1 field_14 tinyint(1) NO 0 field_15 tinyint(1) NO 0 field_16 tinyint(1) NO 0 field_17 tinyint(1) NO 1 field_18 tinyint(1) NO 0 field_19 tinyint(1) NO 0 field_20 tinyint(1) NO 0 create_date datetime NO 2012-01-01 00:00:00 last_update datetime NO 2012-01-01 00:00:00 run_id int(10) unsigned NO 999 I used a surrogate key because I had read that it was good practice. Since, from a business perspective, I want to keep aware of potential fraudulent activity (say for 200 days a user is associated with state X and then the next day they are associated with state Y - they could have moved or their account could have been compromised), so that is why geographic data is kept. The field id_B may have a few distinct values of id_A associated with it, but I am interested in knowing distinct (id_A, id_B) tuples. In the context of this information, my friend suggested that something like (id_A, id_B, zip_code) be the primary key. For the large majority of daily ETL processes (80%), I only expect the following fields to be updated for existing records: field_10 - field_14, last_update, and run_id (this field is a foreign key to my etlLog table and is used for ETL auditing purposes).

Read the article
Data validate tools (ETL tools) for SQL server

- by Stan

I have some data in Excel and need to import into database. Is there any tool that can validate and maybe clean the data? Does Red Gate have such tool? The input will be Excel. Given table constraints, eg. CHECK, UNIQUE KEY, datetime format, NOT NULL. Desire output should be as least shows which lines are having problems, and then fix some trivial error automatically, like fill in default value for NULL columns, automatically correct datetime format. I know using Python can build such a script. But just wonder what's the popular way to do this. Thanks.

Read the article
Using RhinoMocks, how do you mock or stub a concrete class without an empty constructor?

- by Mark Rogers

Mocking a concrete class with Rhino Mocks seems to work pretty easy when you have an empty constructor on a class: public class MyClass{ public MyClass() {} } But if I add a constructor that takes parameters and remove the one that doesn't take parameters: public class MyClass{ public MyClass(MyOtherClass instance) {} } I tend to get an exception: System.MissingMethodException : Can't find a constructor with matching arguments I've tried putting in nulls in my call to Mock or Stub, but it doesn't work. Can I create mocks or stubs of concrete classes with Rhino Mocks, or must I always supply (implicitly or explicitly) a parameter-less constructor?

Read the article
generated service mock: everything but RhinoMocks fails?

- by hko

I have the "quest" to search for the next Mocking Framework for my company, and basically it's down to NSubstitute (simplest syntax, but no strict mocks), FakeItEasy(best reviews, Roy Osherove bonus, and slightly better lib support than NSubstitute), Moq (best "other libs support", biggest featureset, downside: mock.Object). We definitely want to move on from RhinoMocks, e.g. because of the unusefull interactiontest error messages (it should tell me what the parameter was instead, when a verification fails). So I was pretty surprised the other day (that was yesterday) when I found out RhinoMocks could do a thing where every other mock framework fails at: Mocking an autogenerated SomethingService (a typical VS autogenerated service with a default construtor in a partial class). Please don't argue about the design.. I intend to write lightweight integration tests (and some unit tests), and I can't mess around with the service, the product is installed on too many customers system. See this code: // here the NSubstitute and FakeItEasy equivalents throw an exception.. see below TicketStoreService fakeTicketStoreService = MockRepository.GenerateMock<TicketStoreService>(); fakeTicketStoreService.Expect(service => service.DoSomething(Arg.Is(new Guid())).Return(new Guid()); fakeTicketStoreService.DoSomething(Arg.Is(new Guid())); fakeTicketStoreService.VerifyAllExpectations(); Note that DoSomething is a non-virtual methodcall in an autogenerated class. So it shouldn't work, according to common knowledge. But it does. Problem is that it's the only (non commercial) framework that can do this: Rhino.Mocks works, and verification works too FakeItEasy says it doesn't find a default constructor (probably just wrong exception message): No default constructor was found on the type SomeNamespace.TicketStoreService Moq gives something sane and understandable: Invalid setup on a non-virtual (overridable in VB) member: service=> service.DoSomething Nsubstitute gives a message System.NotSupportedException: Cannot serialize member System.ComponentModel.Component.Site of type System.ComponentModel.ISite because it is an interface. I'm really wondering what's going on here with the frameworks, except Moq. The "fancy new" frameworks seem to have an initial perf hit too, probably preparing some Type cache and serializing stuff, whilst RhinoMocks somehow manages to create a very "slim" mock without recursion. I have to admit I didn't like RhinoMocks very well, but here it shines.. unfortunately. So, is there a way to get that to work with newer (non-commercial!) mocking frameworks, or somehow get a sane error message out of Rhino.Mocks? And why can Rhino.Mocks achieve this, when clearly every Mocking framework states it can only work with virtual methods when given a concrete class? Let's not derail the discussion by talking about alternative approaches like Extract&Override or runtime-proxy Mocking frameworks like JustMock/TypeMock/Moles or the new Fakes framework, I know these, but that would be less ideal solutions, for reasons beyond this topic. Any help appreciated..

Read the article
Decent JavaScript IDE

- by thatismatt

What is a decent IDE for developing JavaScript, I'll be writing both client side stuff and writing for Rhino. Ideally It needs to run on Mac OSX, although something that runs on Windows too would be nice. ADDITIONAL: Having had a play with both js2 and Aptana, I think I'll be continuing to use Aptana. Mainly because I find emacs a bit hard to get my head round, although I did think that the error hi-lighting in js2 was better than that in Aptana. I'm still looking for a way to visually debug my js code that is running atop Rhino...

Read the article
How to figure out which record has been deleted in an effiecient way?

- by janetsmith

Hi, I am working on an in-house ETL solution, from db1 (Oracle) to db2 (Sybase). We needs to transfer data incrementally (Change Data Capture?) into db2. I have only read access to tables, so I can't create any table or trigger in Oracle db1. The challenge I am facing is, how to detect record deletion in Oracle? The solution which I can think of, is by using additional standalone/embedded db (e.g. derby, h2 etc). This db contains 2 tables, namely old_data, new_data. old_data contains primary key field from tahle of interest in Oracle. Every time ETL process runs, new_data table will be populated with primary key field from Oracle table. After that, I will run the following sql command to get the deleted rows: SELECT old_data.id FROM old_data WHERE old_data.id NOT IN (SELECT new_data.id FROM new_data) I think this will be a very expensive operation when the volume of data become very large. Do you have any better idea of doing this? Thanks.

Read the article
What books should be in the modern datawarehouse architects shelf?

- by McPeterson

I have a few on mine. Basically just the set of Kimball books: The Data Warehouse LifeCycle The Data Warehouse ETL Toolkit The Data Warehouse Toolkit Are there any others that should be included? Or any other good ones you know? -mcpeterson

Read the article
extract transform load

- by mitch

Wikipedia defines a 'typical' ETL cycle as : Cycle initiation Build reference data Extract (from sources) Validate Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates) Stage (load into staging tables, if used) Audit reports (for example, on compliance with business rules. Also, in case of failure, helps to diagnose/repair) Publish (to target tables) Archive Clean up ..What is meant by 'Build reference data'?

Read the article
What books should be on the modern data warehouse architect's shelf?

- by McPeterson

I have a few on mine. Basically just the set of Kimball books: The Data Warehouse LifeCycle The Data Warehouse ETL Toolkit The Data Warehouse Toolkit Are there any others that should be included? Or any other good ones you know? -mcpeterson

Read the article
What are CAD apps written in, and how are they organized ?

- by ldigas

What are CAD applications (Rhino, Autocad) of today written in and how are they organized internally ? I gave as an example, Autocad and Rhino, although I would love to hear of other examples as well. I'm particularly interested in knowing what is their backend written in (multilanguage ?) and how is it organized, and how do they handle their frontend (GUI) in real time ? Do they use native windows API's or some libraries of their own, since I imagine, as good as may be, the open source solutions on today's market won't cut it. I may be wrong ... As most of you who have used them know, they handle amongs other things relatively complex rotational operations in realtime (shading is not interesting me). I've been doing some experiments with several packages recently, and for some larger models found that there is considerable difference in speed in, for example, programed rotation (big full ship models) amongst some of them (which I won't name). So I'm wondering about their internals ... Also, if someone knows of some book on the subject, I'd be interested to hear of it.

Read the article
How to extract data from Google Analytics and build a data warehouse (webhouse) from it?

- by nkaur301

I have click stream data such as referring URL, top landing pages, top exit pages and metrics such as page views, number of visits, bounces all in Google Analytics. I am required to build a data warehouse from scratch(which I believe is known as web-house) from this data. My questions are:- 1)Is it possible? Every day data increases (some in terms of metrics or measures such as visits and some in terms of new referring sites), how would the process of loading the warehouse go about? 2)What ETL tool would help me to achieve this? Pentaho I believe has a way to pull out data from Google Analytics, has anyone used it? How does that process go? Any references, links would be appreciated besides answers.

Read the article
Best way to mock WCF Client proxy

- by chugh97

Are there any ways to mock a WCF client proxy using Rhino mocks framework so I have access to the Channel property? I am trying to unit test Proxy.Close() method but as the proxy is constructed using the abstract base class ClientBast which has the ICommunication interface, my unit test is failing as the internal infrastructure of the class is absent in the mock object. Any good ways with code samples would be greatly appreciated.

Read the article
Server side Javascript best practices?

- by Petteri Hietavirta

We have a CMS built on Java and it has Mozilla Rhino for the server side JS. At the moment the JS code base is small but growing. Before it is too late and code has become a horrible mess I want to introduce some best practices and coding style. Obviously the name space control is pretty important. But how about other best practices - especially for Java programmers?

Read the article
Java 6 ScriptEngine and JSON.parse problem

- by Tim

The Rhino release that is included in Java 6 ScriptEngine does not have a JSON parser. I've tried including crockfords JSON2.js in my script on the scriptengine.eval(). When I try to do the JSON.parse, it ends up giving me a script error that .replace is an unknown function. .replace is referenced several places in JSON2, and it works fine inside a browser (IE7, IE8, FF3). Anyone see this and have a suggestion?

Read the article
IgnoreArguments including an Action<T>

- by James L

I'm probably missing something obvious here, so apologies in advance! Using Rhino Mocks, how do I set an expectation that a method taking an Action will be called, but I want to use IgnoreArguments. Obviously I can't specify null as that isn't an Action, and I dont want any meaningless code in the test. As I said, it's probably obvious by the syntax is eluding me at the moment!

Read the article
TDD a controller with ASP.NET MVC 2, NUnit and Rhine Mocks

- by Nissan Fan

What would a simple unit test look like to confirm that a certain controller exists if I am using Rhino Mocks, NUnit and ASP.NET MVC 2? I'm trying to wrap my head around the concept of TDD, but I can't see to figure out how a simple test like "Controller XYZ Exists" would look. In addition, what would the unit test look like to test an Action Result off a view?

Read the article
Is Entity Framework more mockable than Linq2Sql?

- by lance

I've found that Linq2Sql doesn't (Rhino) mock well, as the interfaces I need aren't there. Does EF generate code that's more mockable? NOTE: I'm not mocking, yet, without interfaces, the next reader of this question may not have my bias. EDIT: VS2008 / 3.5 for now.

Read the article
how can protected members of base class be accessed during unit test?

- by amateur

I am creating a unit test in mstest with rhino mocks. I have a class A that inherits class B. I am testing class A and create an instance of it for my test. The class it inherits, "B", has some protected methods and protected properties that I would like to access for the benefit of my tests. For example, validate that a protected property on my base class has the expected value. Any ideas how I might access these protected properties of class B during my test?

Read the article
How do I combine two interfaces when creating mocks?

- by sduplooy

We are using Rhino Mocks to perform some unit testing and need to mock two interfaces. Only one interface is implemented on the object and the other is implemented dynamically using an aspect-oriented approach. Is there an easy way to combine the two interfaces dynamically so that a mock can be created and the methods stubbed for both interfaces?

Read the article
Empty data problem - data layer or DAL?

- by luckyluke

I designing the new App now and giving the following question a lot of thought. I consume a lot of data from the warehouse, and the entities have a lot of dictionary based values (currency, country, tax-whatever data) - dimensions. I cannot be assured though that there won't be nulls. So I am thinking: create an empty value in each of teh dictionaries with special keyID - ie. -1 do the ETL (ssis) do the correct stuff and insert -1 where it needs to let the DAL know that -1 is special (Static const whatever thing) don't care in the code to check for nullness of dictionary entries because THEY will always have a value But maybe I should be thinking: import data AS IS let the DAL do the thinking using empty record Pattern still don't care in the code because business layer will have what it needs from DAL. I think is more of a approach thing but maybe i am missing something important here... What do You think? Am i clear? Please don't confuse it with empty record problem. I do use emptyCustomer think all the time and other defaults too.

Read the article
Handling primary key duplicates in a data warehouse load

- by Meff

I'm currently building an ETL system to load a data warehouse from a transactional system. The grain of my fact table is the transaction level. In order to ensure I don't load duplicate rows I've put a primary key on the fact table, which is the transaction ID. I've encountered a problem with transactions being reversed - In the transactional database this is done via a status, which I pick up and I can work out if the transaction is being done, or rolled back so I can load a reversal row in the warehouse. However, the reversal row will have the same transaction ID and so I get a primary key violation. I've solved this for now by negating the primary key, so transaction ID 1 would be a payment, and transaction ID -1 (In the warehouse only) would be the reversal. I have considered an alternative of generating a BIT column, where 0 is normal and 1 is reversal, then making the PK the transaction ID and the BIT column. My question is, is this a good practice, and has anyone else encountered anything like this? For reference, this is a payment processing system, so values will not be modified, so there will only ever be transactions and reversals.

Read the article
Disable selected automated tests at runtime

- by squig

Is is posable to disable selected automated tests at runtime? I'm using VSTS and rhino mocks and have some intergation tests that require an external dependancy to be installed (MQ). Not all the developers on my team have this installed. Currently all the tests that require MQ inherit from a base class that checks if MQ is installed and if is not sets the test result to inconclusive. This works as it stops the tests from running, but marks the test run as unsuccseessful and can hide other failures. Any ideas?

Read the article

< Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >