Search Results

Search found 266 results on 11 pages for 'etl'.

Page 10/11 | < Previous Page | 6 7 8 9 10 11 | Next Page >

Archiving SQLHelp tweets

- by jamiet

#SQLHelp is a Twitter hashtag that can be used by any Twitter user to get help from the SQL Server community. I think its fair to say that in its first year of being it has proved to be a very useful resource however Kendra Little (@kendra_little) made a very salient point yesterday when she tweeted: Is there a way to search the archives of #sqlhelp Trying to remember answer to a question I know I saw a couple months ago http://twitter.com/#!/Kendra_Little/status/15538234184441856 This highlights an inherent problem with Twitter’s search capability – it simply does not reach far enough back in time. I have made steps to remedy that situation by putting into place two initiatives to archive Tweets that contain the #sqlhelp hashtag. The Archivist http://archivist.visitmix.com/ is a free service that, quite simply, archives a history of tweets that contain a given search term by periodically polling Twitter’s search service with that search term and subsequently displaying a dashboard providing an aggregate view of those tweets for things like tweet volume over time, top users and top words (Archivist FAQ). I have set up an archive on The Archivist for “sqlhelp” which you can view at http://archivist.visitmix.com/jamiet/7. Here is a screenshot of the SQLHelp dashboard 36 minutes after I set it up: There is lots of good information in there, including the fact that Jonathan Kehayias (@SQLSarg) is the most active SQLHelp tweeter (I suspect as an answerer rather than a questioner ) and that SSIS has proven to be a rather (ahem) popular subject!! Datasift The Archivist has its uses though for our purposes it has a couple of downsides. For starters you cannot search through an archive (which is what Kendra was after) and nor can you export the contents of the archive for offline analysis. For those functions we need something a bit more heavyweight and for that I present to you Datasift. Datasift is a tool (currently an alpha release) that allows you to search for tweets and provide them through an object called a Datasift stream. That sounds very similar to normal Twitter search though it has one distinct advantage that other Twitter search tools do not – Datasift has access to Twitter’s Streaming API (aka the Twitter Firehose). In addition it has access to a lot of other rather nice features: It provides the Datasift API that allows you to consume the output of a Datasift stream in your tool of choice (bring on my favourite ultimate mashup tool J ) It has a query language (called Filtered Stream Definition Language – FSDL for short) A Datasift stream can consume (and filter) other Datasift streams Datasift can (and does) consume services other than Twitter If I refer to Datasift as “ETL for tweets” then you may get some sort of idea what it is all about. Just as I did with The Archivist I have set up a publicly available Datasift stream for “sqlhelp” at http://datasift.net/stream/1581/sqlhelp. Here is the FSDL query that provides the data: twitter.text contains "sqlhelp" Pretty simple eh? At the current time it provides little more than a rudimentary dashboard but as Datasift is currently an alpha release I think this may be worth keeping an eye on. The real value though is the ability to consume the output of a stream via Datasift’s RESTful API, observe: http://api.datasift.net/stream.xml?stream_identifier=c7015255f07e982afdeebdf1ae6e3c0d&username=jamiet&api_key=XXXXXXX (Note that an api_key is required during the alpha period so, given that I’m not supplying my api_key, this URI will not work for you) Just to prove that a Datasift stream can indeed consume data from another stream I have set up a second stream that further filters the first one for tweets containing “SSIS”. That one is at http://datasift.net/stream/1586/ssis-sqlhelp and here is the FSDL query: rule "414c9845685ff8d2548999cf3162e897" and (interaction.content contains "ssis") When Datasift moves beyond alpha I’ll re-assess how useful this is going to be and post a follow-up blog. @Jamiet

Read the article
My Take on Hadoop World 2011

- by Jean-Pierre Dijcks

I’m sure some of you have read pieces about Hadoop World and I did see some headlines which were somewhat, shall we say, interesting? I thought the keynote by Larry Feinsmith of JP Morgan Chase & Co was one of the highlights of the conference for me. The reason was very simple, he addressed some real use cases outside of internet and ad platforms. The following are my notes, since the keynote was recorded I presume you can go and look at Hadoopworld.com at some point… On the use cases that were mentioned: ETL – how can I do complex data transformation at scale Doing Basel III liquidity analysis Private banking – transaction filtering to feed [relational] data marts Common Data Platform – a place to keep data that is (or will be) valuable some day, to someone, somewhere 360 Degree view of customers – become pro-active and look at events across lines of business. For example make sure the mortgage folks know about direct deposits being stopped into an account and ensure the bank is pro-active to service the customer Treasury and Security – Global Payment Hub [I think this is really consolidation of data to cross reference activity across business and geographies] Data Mining Bypass data engineering [I interpret this as running a lot of a large data set rather than on samples] Fraud prevention – work on event triggers, say a number of failed log-ins to the website. When they occur grab web logs, firewall logs and rules and start to figure out who is trying to log in. Is this me, who forget his password, or is it someone in some other country trying to guess passwords Trade quality analysis – do a batch analysis or all trades done and run them through an analysis or comparison pipeline One of the key requests – if you can say it like that – was for vendors and entrepreneurs to make sure that new tools work with existing tools. JPMC has a large footprint of BI Tools and Big Data reporting and tools should work with those tools, rather than be separate. Security and Entitlement – how to protect data within a large cluster from unwanted snooping was another topic that came up. I thought his Elephant ears graph was interesting (couldn’t actually read the points on it, but the concept certainly made some sense) and it was interesting – when asked to show hands – how the audience did not (!) think that RDBMS and Hadoop technology would overlap completely within a few years. Another interesting session was the session from Disney discussing how Disney is building a DaaS (Data as a Service) platform and how Hadoop processing capabilities are mixed with Database technologies. I thought this one of the best sessions I have seen in a long time. It discussed real use case, where problems existed, how they were solved and how Disney planned some of it. The planning focused on three things/phases: Determine the Strategy – Design a platform and evangelize this within the organization Focus on the people – Hire key people, grow and train the staff (and do not overload what you have with new things on top of their day-to-day job), leverage a partner with experience Work on Execution of the strategy – Implement the platform Hadoop next to the other technologies and work toward the DaaS platform This kind of fitted with some of the Linked-In comments, best summarized in “Think Platform – Think Hadoop”. In other words [my interpretation], step back and engineer a platform (like DaaS in the Disney example), then layer the rest of the solutions on top of this platform. One general observation, I got the impression that we have knowledge gaps left and right. On the one hand are people looking for more information and details on the Hadoop tools and languages. On the other I got the impression that the capabilities of today’s relational databases are underestimated. Mostly in terms of data volumes and parallel processing capabilities or things like commodity hardware scale-out models. All in all I liked this conference, it was great to chat with a wide range of people on Oracle big data, on big data, on use cases and all sorts of other stuff. Just hope they get a set of bigger rooms next time… and yes, I hope I’m going to be back next year!

Read the article
Big Data – Interacting with Hadoop – What is Sqoop? – What is Zookeeper? – Day 17 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Pig and Pig Latin in Big Data Story. In this article we will understand what is Sqoop and Zookeeper in Big Data Story. There are two most important components one should learn when learning about interacting with Hadoop – Sqoop and Zookper. What is Sqoop? Most of the business stores their data in RDBMS as well as other data warehouse solutions. They need a way to move data to the Hadoop system to do various processing and return it back to RDBMS from Hadoop system. The data movement can happen in real time or at various intervals in bulk. We need a tool which can help us move this data from SQL to Hadoop and from Hadoop to SQL. Sqoop (SQL to Hadoop) is such a tool which extract data from non-Hadoop data sources and transform them into the format which Hadoop can use it and later it loads them into HDFS. Essentially it is ETL tool where it Extracts, Transform and Load from SQL to Hadoop. The best part is that it also does extract data from Hadoop and loads them to Non-SQL (or RDBMS) data stores. Essentially, Sqoop is a command line tool which does SQL to Hadoop and Hadoop to SQL. It is a command line interpreter. It creates MapReduce job behinds the scene to import data from an external database to HDFS. It is very effective and easy to learn tool for nonprogrammers. What is Zookeeper? ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. In other words Zookeeper is a replicated synchronization service with eventual consistency. In simpler words – in Hadoop cluster there are many different nodes and one node is master. Let us assume that master node fails due to any reason. In this case, the role of the master node has to be transferred to a different node. The main role of the master node is managing the writers as that task requires persistence in order of writing. In this kind of scenario Zookeeper will assign new master node and make sure that Hadoop cluster performs without any glitch. Zookeeper is the Hadoop’s method of coordinating all the elements of these distributed systems. Here are few of the tasks which Zookeepr is responsible for. Zookeeper manages the entire workflow of starting and stopping various nodes in the Hadoop’s cluster. In Hadoop cluster when any processes need certain configuration to complete the task. Zookeeper makes sure that certain node gets necessary configuration consistently. In case of the master node fails, Zookeepr can assign new master node and make sure cluster works as expected. There many other tasks Zookeeper performance when it is about Hadoop cluster and communication. Basically without the help of Zookeeper it is not possible to design any new fault tolerant distributed application. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Big Data Analytics. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
PowerShell: Read Excel to Create Inserts

- by BuckWoody

I’m writing a series of articles on how to migrate “departmental” data into SQL Server. I also hold workshops on the entire process – from discovering that the data exists to the modeling process and then how to design the Extract, Transform and Load (ETL) process. Finally I write about (and teach) a few methods on actually moving the data. One of those options is to use PowerShell. There are a lot of ways even with that choice, but the one I show is to read two columns from the spreadsheet and output statements that would insert the data using a stored procedure. Of course, you could re-write this as INSERT statements, out to a text file for bcp, or even use a database connection in the script to move the data directly from Excel into SQL Server. This snippet won’t run on your system, of course – it assumes a Microsoft Office Excel 2007 spreadsheet located at c:\temp called VendorList.xlsx. It looks for a tab in that spreadsheet called Vendors. The statement that does the writing just uses one column: Vendor Code. Here’s the breakdown of what I’m doing: In the first block, I connect to Microsoft Office Excel. That connection string is specific to Excel 2007, so if you need a different version you’ll need to look that up. In the second block I set up a selection from the entire spreadsheet based on that tab. Note that if you’re only after certain data you shouldn’t get the whole spreadsheet – that’s just good practice. In the next block I create the text I want, inserting the Vendor Code field as I go. Finally I close the connection. Enjoy! $ExcelConnection= New-Object -com "ADODB.Connection" $ExcelFile="c:\temp\VendorList.xlsx" $ExcelConnection.Open("Provider=Microsoft.ACE.OLEDB.12.0;` Data Source=$ExcelFile;Extended Properties=Excel 12.0;") $strQuery="Select * from [Vendors$]" $ExcelRecordSet=$ExcelConnection.Execute($strQuery) do { Write-Host "EXEC sp_InsertVendors '" $ExcelRecordSet.Fields.Item("Vendor Code").Value "'" $ExcelRecordSet.MoveNext()} Until ($ExcelRecordSet.EOF) $ExcelConnection.Close() Script Disclaimer, for people who need to be told this sort of thing: Never trust any script, including those that you find here, until you understand exactly what it does and how it will act on your systems. Always check the script on a test system or Virtual Machine, not a production system. All scripts on this site are performed by a professional stunt driver on a closed course. Your mileage may vary. Void where prohibited. Offer good for a limited time only. Keep out of reach of small children. Do not operate heavy machinery while using this script. If you experience blurry vision, indigestion or diarrhea during the operation of this script, see a physician immediately. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Logging errors caused by exceptions deep in the application

- by Kaleb Pederson

What are best-practices for logging deep within an application's source? Is it bad practice to have multiple event log entries for a single error? For example, let's say that I have an ETL system whose transform step involves: a transformer, pipeline, processing algorithm, and processing engine. In brief, the transformer takes in an input file, parses out records, and sends the records through the pipeline. The pipeline aggregates the results of the processing algorithm (which could do serial or parallel processing). The processing algorithm sends each record through one or more processing engines. So, I have at least four levels: Transformer - Pipeline - Algorithm - Engine. My code might then look something like the following: class Transformer { void Process(InputSource input) { try { var inRecords = _parser.Parse(input.Stream); var outRecords = _pipeline.Transform(inRecords); } catch (Exception ex) { var inner = new ProcessException(input, ex); _logger.Error("Unable to parse source " + input.Name, inner); throw inner; } } } class Pipeline { IEnumerable<Result> Transform(IEnumerable<Record> records) { // NOTE: no try/catch as I have no useful information to provide // at this point in the process var results = _algorithm.Process(records); // examine and do useful things with results return results; } } class Algorithm { IEnumerable<Result> Process(IEnumerable<Record> records) { var results = new List<Result>(); foreach (var engine in Engines) { foreach (var record in records) { try { engine.Process(record); } catch (Exception ex) { var inner = new EngineProcessingException(engine, record, ex); _logger.Error("Engine {0} unable to parse record {1}", engine, record); throw inner; } } } } } class Engine { Result Process(Record record) { for (int i=0; i<record.SubRecords.Count; ++i) { try { Validate(record.subRecords[i]); } catch (Exception ex) { var inner = new RecordValidationException(record, i, ex); _logger.Error( "Validation of subrecord {0} failed for record {1}", i, record ); } } } } There's a few important things to notice: A single error at the deepest level causes three log entries (ugly? DOS?) Thrown exceptions contain all important and useful information Logging only happens when failure to do so would cause loss of useful information at a lower level. Thoughts and concerns: I don't like having so many log entries for each error I don't want to lose important, useful data; the exceptions contain all the important but the stacktrace is typically the only thing displayed besides the message. I can log at different levels (e.g., warning, informational) The higher level classes should be completely unaware of the structure of the lower-level exceptions (which may change as the different implementations are replaced). The information available at higher levels should not be passed to the lower levels. So, to restate the main questions: What are best-practices for logging deep within an application's source? Is it bad practice to have multiple event log entries for a single error?

Read the article
PASS Summit 2012: keynote and Mobile BI announcements #sqlpass

- by Marco Russo (SQLBI)

Today at PASS Summit 2012 there have been several announcements during the keynote. Moreover, other news have not been highlighted in the keynote but are equally if not more important for the BI community. Let’s start from the big news in the keynote (other details on SQL Server Blog): Hekaton: this is the codename for in-memory OLTP technology that will appear (I suppose) in the next release of the SQL Server relational engine. The improvement in performance and scalability is impressive and it enables new scenarios. I’m curious to see whether it can be used also to improve ETL performance and how it differs from using SSD technology. Updates on Columnstore: In the next major release of SQL Server the columnstore indexes will be updatable and it will be possible to create a clustered index with Columnstore index. This is really a great news for near real-time reporting needs! Polybase: in 2013 it will debut SQL Server 2012 Parallel Data Warehouse (PDW), which will include the Polybase technology. By using Polybase a single T-SQL query will run queries across relational data and Hadoop data. A single query language for both. Sounds really interesting for using BigData in a more integrated way with existing relational databases. And, of course, to load a data warehouse using BigData, which is the ultimate goal that we all BI Pro have, right? SQL Server 2012 SP1: the Service Pack 1 for SQL Server 2012 is available now and it enable the use of PowerPivot for SharePoint and Power View on a SharePoint 2013 installation with Excel 2013. Power View works with Multidimensional cube: the long-awaited feature of being able to use PowerPivot with Multidimensional cubes has been shown by Amir Netz in an amazing demonstration during the keynote. The interesting thing is that the data model behind was based on a many-to-many relationship (something that is not fully supported by Power View with Tabular models). Another interesting aspect is that it is Analysis Services 2012 that supports DAX queries run on a Multidimensional model, enabling the use of any future tool generating DAX queries on top of a Multidimensional model. There are still no info about availability by now, but this is *not* included in SQL Server 2012 SP1. So what about Mobile BI? Well, even if not announced during the keynote, there is a dedicated session on this topic and there are very important news in this area: iOS, Android and Microsoft mobile platforms: the commitment is to get data exploration and visualization capabilities working within June 2013. This should impact at least Power View and SharePoint/Excel Services. This is the type of UI experience we are all waiting for, in order to satisfy the requests coming from users and customers. The important news here is that native applications will be available for both iOS and Windows 8 so it seems that Android will be supported initially only through the web. Unfortunately we haven’t seen any demo, so it’s not clear what will be the offline navigation experience (and whether there will be one). But at least we know that Microsoft is working on native applications in this area. I’m not too surprised that HTML5 is not the magic bullet for all the platforms. The next PASS Business Analytics conference in 2013 seems a good place to see this in action, even if I hope we don’t have to wait other six months before seeing some demo of native BI applications on mobile platforms! Viewing Reporting Services reports on iPad is supported starting with SQL Server 2012 SP1, which has been released today. This is another good reason to install SP1 on SQL Server 2012. If you are at PASS Summit 2012, come and join me, Alberto Ferrari and Chris Webb at our book signing event tomorrow, Thursday 8 2012, at the bookstore between 12:00pm and 12:30pm, or follow one of our sessions!

Read the article
Fetching Partition Information

- by Mike Femenella

For a recent SSIS package at work I needed to determine the distinct values in a partition, the number of rows in each partition and the file group name on which each partition resided in order to come up with a grouping mechanism. Of course sys.partitions comes to mind for some of that but there are a few other tables you need to link to in order to grab the information required. The table I’m working on contains 8.8 billion rows. Finding the distinct partition keys from this table was not a fast operation. My original solution was to create a temporary table, grab the distinct values for the partitioned column, then update via sys.partitions for the rows and the $partition function for the partitionid and finally look back to the sys.filegroups table for the filegroup names. It wasn’t pretty, it could take up to 15 minutes to return the results. The primary issue is pulling distinct values from the table. Queries for distinct against 8.8 billion rows don’t go quickly. A few beers into a conversation with a friend and we ended up talking about work which led to a conversation about the task described above. The solution was already built in SQL Server, just needed to pull it together. The first table I needed was sys.partition_range_values. This contains one row for each range boundary value for a partition function. In my case I have a partition function which uses dayid values. For example July 4th would be represented as an int, 20130704. This table lists out all of the dayid values which were defined in the function. This eliminated the need to query my source table for distinct dayid values, everything I needed was already built in here for me. The only caveat was that in my SSIS package I needed to create a bucket for any dayid values that were out of bounds for my function. For example if my function handled 20130501 through 20130704 and I had day values of 20130401 or 20130705 in my table, these would not be listed in sys.partition_range_values. I just created an “everything else” bucket in my ssis package just in case I had any dayid values unaccounted for. To get the number of rows for a partition is very easy. The sys.partitions table contains values for each partition. Easy enough to achieve by querying for the object_id and index value of 1 (the clustered index) The final piece of information was the filegroup name. There are 2 options available to get the filegroup name, sys.data_spaces or sys.filegroups. For my query I chose sys.filegroups but really it’s a matter of preference and data needs. In order to bridge between sys.partitions table and either sys.data_spaces or sys.filegroups you need to get the container_id. This can be done by joining sys.allocation_units.container_id to the sys.partitions.hobt_id. sys.allocation_units contains the field data_space_id which then lets you join in either sys.data_spaces or sys.file_groups. The end result is the query below, which typically executes for me in under 1 second. I’ve included the join to sys.filegroups and to sys.dataspaces, and I’ve just commented out the join sys.filegroups. As I mentioned above, this shaves a good 10-15 minutes off of my original ssis package and is a really easy tweak to get a boost in my ETL time. Enjoy.

Read the article
Loading and binding a serialized view model to a WPF window?

- by generalt

Hello all. I'm writing a one-window UI for a simple ETL tool. The UI consists of the window, the code behind for the window, a view model for the window, and the business logic. I wanted to provide functionality to the users to save the state of the UI because the content of about 10-12 text boxes will be reused between sessions, but are specific to the user. I figured I could serialize the view model, which contains all the data from the textboxes, and this works fine, but I'm having trouble loading the information in the serialized XML file back into the text boxes. Constructor of window: public ETLWindow() { InitializeComponent(); _viewModel = new ViewModel(); this.DataContext = _viewModel; _viewModel.State = Constants.STATE_IDLE; Loaded += new RoutedEventHandler(MainWindow_Loaded); } XAML: <TextBox x:Name="targetDirectory" IsReadOnly="true" Text="{Binding TargetDatabaseDirectory, UpdateSourceTrigger=PropertyChanged}"/> ViewModel corresponding property: private string _targetDatabaseDirectory; [XmlElement()] public string TargetDatabaseDirectory { get { return _targetDatabaseDirectory; } set { _targetDatabaseDirectory = value; OnPropertyChanged(DataUtilities.General.Utilities.GetPropertyName(() => new ViewModel().TargetDatabaseDirectory)); } Load event in code behind: private void loadState_Click(object sender, RoutedEventArgs e) { string statePath = this.getFilePath(); _viewModel = ViewModel.LoadModel(statePath); } As you can guess, the LoadModel method deserializes the serialized file on the user's drive. I couldn't find much on the web regarding this issue. I know this probably has something to do with my bindings. Is there some way to refresh on the bindings on the XAML after I deserialize the view model? Or perhaps refresh all properties on the view model? Or am I completely insane thinking any of this could be done? Thanks.

Read the article
SSAS Cube reprocessing fails - then succeeds if I try again

- by EdgarVerona

So I'm basically brand new to the concept of BI, and I've inherited an existing ETL process that is a two step process: 1) Loads the data into a database that is only used by the cube processing 2) Starts off the SSAS cube processing against said database It seems pretty well isolated, but occasionally (once a week, sometimes twice) it will fail with the following exception: "Errors in the OLAP storage engine: The attribute key cannot be found" Now the interesting thing is that: 1) The dimension having the issue is not usually the same one (i.e. there's no single dimension that consistently has this failure) 2) The source table, when I inspect it, does actually contain the attribute key that it says could not be found And, most interestingly... 3) If I then immediately reprocess the dimensions and cubes manually through SSMS, they reprocess successfully and without incident. In both the aforementioned job and when I reprocess them through SSMS, I am using "ProcessFull", so it should be reprocessing them completely. Has anyone run into such an issue? I'm scratching my head about it... because if it was a genuine data integrity issue, reprocessing the cube again wouldn't fix it. What on earth could be happening? I've been tasked with finding out why this happens, but I can neither reproduce it consistently nor can I point to a data integrity problem as the root cause. Thanks for any input you can provide!

Read the article
How to restrict access to a class's data based on state?

- by Marcus Swope

In an ETL application I am working on, we have three basic processes: Validate and parse an XML file of customer information from a third party Match values received in the file to values in our system Load customer data in our system The issue here is that we may need to display the customer information from any or all of the above states to an internal user and there is data in our customer class that will never be populated before the values have been matched in our system (step 2). For this reason, I would like to have the values not even be available to be accessed when the customer is in this state, and I would like to have to avoid some repeated logic everywhere like: if (customer.IsMatched) DisplayTextOnWeb(customer.SomeMatchedValue); My first thought for this was to add a couple interfaces on top of Customer that would only expose the properties and behaviors of the current state, and then only deal with those interfaces. The problem with this approach is that there seems to be no good way to move from an ICustomerWithNoMatchedValues to an ICustomerWithMatchedValues without doing direct casts, etc... (or at least I can't find one). I can't be the first to have come across this, how do you normally approach this? As a last caveat, I would like for this solution to play nice with FluentNHibernate :) Thanks in advance...

Read the article
Which isolation level should I use for the following insert-if-not-present transaction?

- by Steve Guidi

I've written a linq-to-sql program that essentially performs an ETL task, and I've noticed many places where parallelization will improve its performance. However, I'm concerned about preventing uniquness constraint violations when two threads perform the following task (psuedo code). Record CreateRecord(string recordText) { using (MyDataContext database = GetDatabase()) { Record existingRecord = database.MyTable.FirstOrDefault(record.KeyPredicate()); if(existingRecord == null) { existingRecord = CreateRecord(recordText); database.MyTable.InsertOnSubmit(existingRecord); } database.SubmitChanges(); return existingRecord; } } In general, this code executes a SELECT statement to test for record existance, followed by an INSERT statement if the record doesn't exist. It is encapsulated by an implicit transaction. When two threads run this code for the same instance of recordText, I want to prevent them from simultaneously determining that the record doesn't exist, thereby both attempting to create the same record. An isolation level and explicit transaction will work well, except I'm not certain which isolation level I should use -- Serializable should work, but seems too strict. Is there a better choice?

Read the article
server 2008 r2 - wbadmin systemstatebackup - system writer not found in the backup

- by TWood

I am trying to manually run a systemstatebackup command on my server 2008 r2 box and I am getting an error code '2155347997' when I view the backup event log details. The command line tells me that I have log files written to the c:\windows\logs\windowsserverbackup\ path but I have no files of the .log type there. My command window tells me "System Writer is not found in the backup". However when I run vssadmin list writers I find System Writer in the list and it shows normal status with no last errors stored. I am running this from an elevated command prompt as well as from a logged on administrator account. My backup target path has permission for network service to have full control and it has plenty of free space. Looking in eventlog I have two VSS error 8194 that happen immediately before the Backup error 517 which has the errorcode 2155347997 listed. All three of these errors are a result of trying to run the command for the systemstatebackup. It's my belief that some VSS related permission is failing and exiting the backup process before it ever gets started. Because of this the initial code that creates the log files must not be running and this is why I have no files. When running the systemstatebackup command from the command prompt and watching the windowsserverbackup directory I do see that I have a Wbadmin.0.etl file which gets created but it is deleted when the backup errors out and stops. I have looked online and there are numerous opinions as to the cause of this error. These are the things I have corrected to try and fix this issue before posting here: Machine runs a HP 1410i smart array controller but at one time also used a LSI scsi card. Used networkadminkb.com's kb# a467 to find one LSI_SCSI entry in HKLMSysCurrentControlSetServices which start was set to 0x0 and I modified to 0x3. No changes. In HKLMSystemCurrentControlSetServicesVSSDiag I gave network service full control where it previously only had "Special Permission". No changes. I followed KB2009272 to manually try to fix system writer. These are all of the things I have tried. What else should I look at to resolve this issue? It may be important to note that I run Mozy Pro on this server and that was known in the past to use VSS for copying operations and it occasionally threw an error. However since an update last year those error event log entries have stopped.

Read the article
T-SQL Tuesday #21 - Crap!

- by Most Valuable Yak (Rob Volk)

Adam Machanic's (blog | twitter) ever popular T-SQL Tuesday series is being held on Wednesday this time, and the topic is… SHIT CRAP. No, not fecal material. But crap code. Crap SQL. Crap ideas that you thought were good at the time, or were forced to do due (doo-doo?) to lack of time. The challenge for me is to look back on my SQL Server career and find something that WASN'T crap. Well, there's a lot that wasn't, but for some reason I don't remember those that well. So the additional challenge is to pick one particular turd that I really wish I hadn't squeezed out. Let's see if this outline fits the bill: An ETL process on text files; That had to interface between SQL Server and an AS/400 system; That didn't use SSIS (should have) or BizTalk (ummm, no) but command-line scripting, using Unix utilities(!) via: xp_cmdshell; That had to email reports and financial data, some of it sensitive Yep, the stench smell is coming back to me now, as if it was yesterday… As to why SSIS and BizTalk were not options, basically I didn't know either of them well enough to get the job done (and I still don't). I also had a strict deadline of 3 days, in addition to all the other responsibilities I had, so no time to learn them. And seeing how screwed up the rest of the process was: Payment files from multiple vendors in multiple formats; Sent via FTP, PGP encrypted email, or some other wizardry; Manually opened/downloaded and saved to a particular set of folders (couldn't change this); Once processed, had to be placed BACK in the same folders with the original archived; x2 divisions that had to run separately; Plus an additional vendor file in another format on a completely different schedule; So that they could be MANUALLY uploaded into the AS/400 system (couldn't change this either, even if it was technically possible) I didn't feel so bad about the solution I came up with, which was naturally: Copy the payment files to the local SQL Server drives, using xp_cmdshell Run batch files (via xp_cmdshell) to parse the different formats using sed, a Unix utility (this was before Powershell) Use other Unix utilities (join, split, grep, wc) to process parsed files and generate metadata (size, date, checksum, line count) Run sqlcmd to execute a stored procedure that passed the parsed file names so it would bulk load the data to do a comparison bcp the compared data out to ANOTHER text file so that I could grep that data out of the original file Run another stored procedure to import the matched data into SQL Server so it could process the payments, including file metadata Process payment batches and log which division and vendor they belong to Email the payment details to the finance group (since it was too hard for them to run a web report with the same data…which they ran anyway to compare the emailed file against…which always matched, surprisingly) Email another report showing unmatched payments so they could manually void them…about 3 months afterward All in "Excel" format, using xp_sendmail (SQL 2000 system) Copy the unmatched data back to the original folder locations, making sure to match the file format exactly (if you've ever worked with ACH files, you'll understand why this sucked) If you're one of the 10 people who have read my blog before, you know that I love the DOS "for" command. Like passionately. Like fairy-tale love. So my batch files were riddled with for loops, nested within other for loops, that called other batch files containing for loops. I think there was one section that had 4 or 5 nested for commands. It was wrong, disturbed, and completely un-maintainable by anyone, even myself. Months, even a year, after I left the company I got calls from someone who had to make a minor change to it, and they called me to talk them out of spraying the office with an AK-47 after looking at this code. (for you Star Trek TOS fans) The funniest part of this, well, one of the funniest, is that I made the deadline…sort of, I was only a day late…and the DAMN THING WORKED practically unchanged for 3 years. Most of the problems came from the manual parts of the overall process, like forgetting to decrypt the files, or missing/late files, or saved to the wrong folders. I'm definitely not trying to toot my own horn here, because this was truly one of the dumbest, crappiest solutions I ever came up with. Fortunately as far as I know it's no longer in use and someone has written a proper replacement. Today I would knuckle down and do it in SSIS or Powershell, even if it took me weeks to get it right. The real lesson from this crap code is to make things MAINTAINABLE and UNDERSTANDABLE. sed scripting regular expressions doesn't fit that criteria in any way. If you ever find yourself under pressure to do something fast at all costs, DON'T DO IT. Stop and consider long-term maintainability, not just for yourself but for others on your team. If you can't explain the basic approach in under 5 minutes, it ultimately won't succeed. And while you may love to leave all that crap behind, it may follow you anyway, and you'll step in it again. P.S. - if you're wondering about all the manual stuff that couldn't be changed, it was because the entire process had gone through Six Sigma, and was deemed the best possible way. Phew! Talk about stink!

Read the article
Limiting Audit Exposure and Managing Risk – Q&A and Follow-Up Conversation

- by Tanu Sood

Thanks to all who attended the live ISACA webcast on Limiting Audit Exposure and Managing Risk with Metrics-Driven Identity Analytics. We were really fortunate to have Don Sparks from ISACA moderate the webcast featuring Stuart Lincoln, Vice President, IT P&L Client Services, BNP Paribas, North America and Neil Gandhi, Principal Product Manager, Oracle Identity Analytics. Stuart’s insights given the team’s role in providing IT for P&L Client Services and his tremendous experience in identity management and establishing sustainable compliance programs were true value-add at yesterday’s webcast. And if you are a healthcare organization looking to solve your compliance and security challenges, we recommend you join us for a live webcast on Tuesday, November 29 at 10 am PT. The webcast will feature experts from Kaiser Permanente, PricewaterhouseCoopers and Oracle and the focus of the discussion will be around the compliance challenges a healthcare organization faces and best practices for tackling those. Here are the details: Healthcare IT News Webcast: Managing Risk and Enforcing Compliance in Healthcare with Identity Analytics Tuesday, November 29, 201110:00 a.m. PT / 1:00 p.m. ET Register Today The ISACA webcast replay is now available on-demand and the slides are also available for download. Since we didn’t have time to address all the questions we received during the live Q&A portion of the webcast, we have captured responses to the remaining questions here. Please continue to provide us your feedback and insights from your experience in deploying identity compliance solutions. Q. Can you please clarify the mechanism utilized to populate the Identity Warehouse from each individual application's access management function / files? A. Oracle Identity Analytics (OIA) supports direct imports from applications. Data collection is based on Extract, Transform and Load (ETL) that eliminates the need to write connectors to different applications. Oracle Identity Analytics’ import engine supports complex entitlement feeds saved as either text files or XML. The imports can be scheduled on a periodic basis or triggered as needed. If the applications are synchronized with a user provisioning solution like Oracle Identity Manager, Oracle Identity Analytics has a seamless integration to pull in data from Oracle Identity Manager. Q. Can you provide a short summary of the new features in your latest release of Oracle Identity Analytics? A. Oracle recently announced availability of enhanced Oracle Identity Analytics. This release focused on easing the certification process by offering risk analytics driven certification, advanced certification screens, business centric views and significant improvement in performance including 3X faster data imports, 3X faster certification campaign generation and advanced auto-certification features, that will allow organizations to improve user productivity by up to 80%. Closed-loop risk feedback and IT policy monitoring with Oracle Identity Manager, a leading user provisioning solution, allows for more accurate certification reviews. And, OIA's improved performance enables customers to scale compliance initiatives supporting millions of user entitlements across thousands of applications, whether on premise or in the cloud, without compromising speed or integrity. Q. Will ISACA grant a CPE credit for attending this ISACA-sponsored webinar today? A. From ISACA: Hello and thank you for your interest in the 2011 ISACA Webinar Program! Unfortunately, there are no CPEs offered for this program, archived or live. We will be looking into the feasibility of offering them in the future. Q. Would you be able to use this to help manage licenses for software? That is to say - could it track software that is not used by a user, thus eliminating the software license? A. OIA’s integration with Oracle Identity Manager, a leading user provisioning solution, allows organizations to detect ghost accounts or unused accounts via account reconciliation. Based on company’s policies, this could trigger an automated workflow for account deletion or asking for further investigation. Closed-loop feedback between the two solutions would then allow visibility into the complete audit trail of when the account was detected, the action taken, by whom, when and the current status. Q. We have quarterly attestations and .xls mechanisms are not working. Once the identity data is correlated in Identity Analytics, do you then automate access certification? A. OIA’s identity warehouse analyzes and correlates identity data across various resources that allows OIA to determine a user’s risk profile, who the access review request should go to, along with all the relevant access details of the user. The access certification manager gets notification on what to review, when and the relevant data is presented in a business friendly screen. Based on the result of the access certification process, actions are triggered and results recorded and archived. Access review managers have visual risk indicators that also allow them to prioritize access certification tasks and efforts. Q. How does Oracle Identity Analytics work with Cloud Security? A. For enterprises looking to build their own cloud(s), Oracle offers a set of security services that cloud developers can leverage including Oracle Identity Analytics. For enterprises looking to manage their compliance requirements but without hosting those in-house and instead having a hosting provider offer managed Identity Management services to the organizations, Oracle Identity Analytics can be leveraged much the same way as you’d in an on-premise (within the enterprise) environment. In fact, organizations today are leveraging Oracle Identity Analytics to manage identity compliance in both these ways. Q. Would you recommend this as a cost effective solution for a smaller organization with @ 2,500 users? A. The key return-on-investment (ROI) on Oracle Identity Analytics is derived from automating compliance processes thereby eliminating administrative overhead, minimizing errors, maintaining cost- and time-effective sustainable compliance processes and minimizing audit exposures and penalties. Of course, there are other tangible benefits that are derived from an Oracle Identity Analytics implementation as outlined in the webcast. For a quantitative analysis of your requirements and potential ROI calculation, we recommend you refer to the Forrester Study on Total Economic Impact of Oracle Identity Analytics. For an in-person discussion, please email Richard Caldwell.

Read the article
Closing the gap between strategy and execution with Oracle Business Intelligence 11g

- by manan.goel(at)oracle.com

Wikipedia defines strategy as a plan of action designed to achieve a particular goal. An example of this is General Electric's acquisitions and divestiture strategy (plan) designed to propel GE to number 1 or 2 place (goal) in every business segment that it operated in. Execution on the other hand can be defined as the actions taken to getting things done. In GE's case execution will be steps followed for mergers/acquisitions or divestiture. Business press has written extensively about the importance of both strategy and execution in achieving desired business objectives. Perhaps the quote from Thomas Edison says it best - "vision without execution is hallucination". Conversely, it can be said that "execution without vision" is well may be "wishful thinking". Research overwhelmingly point towards the wide gap between strategy and execution. According to a published study, 49% of surveyed executives perceive a gap between their organizations' ability to develop and communicate sound strategies and their ability to implement those strategies. Further, of these respondents, 64% don't have full confidence that their companies will be able to close the gap. Having established the severity and importance of the problem let's talk about the reasons for the strategy-execution gap. The common reasons include: - Lack of clearly defined goals - Lack of consistent measure of success - Lack of ownership - Lack of alignment - Lack of communication - Lack of proper execution - Lack of monitoring There are multiple approaches to solving the problem including organizational development practices, technology enablement etc. In most cases a combination of approaches is required to achieve the desired result. For the purposes of this discussion, I'll focus on technology. Imagine an integrated closed loop technology platform that automates the entire management cycle from defining strategy to assigning ownership to communicating goals to achieving alignment to collaboration to taking actions to monitoring progress and achieving mid course corrections. Besides, for best ROI and lowest TCO such a system should also have characteristics like: Complete - Full functionality - Rich end user access Open - Any data source - Any business application - Any technology stack Integrated - Common metadata - Common security - Common system management From a capabilities perspective the system should provide the following capabilities: Define - Strategy - Objectives - Ownership - KPI's Communicate - Pervasive - Collaborative - Role based - Secure Execute - Integrated - Intuitive - Secure - Ubiquitous Monitor - Multiple styles and formats - Exception based - Push & Pull Having talked about the business problem and outlined the blueprint for a technology solution, let's talk about how Oracle Business Intelligence 11g can help. Oracle Business Intelligence is a comprehensive business intelligence solution for reporting, ad hoc query and analysis, OLAP, dashboards and scorecards. Oracle's best in class BI platform is based on an architecturally integrated technology foundation that provides a unified end user experience and features a Common Enterprise Information Model, with common security, query request generation and optimization, and system management. The BI platform is · Complete - meaning it delivers all modes and styles of BI including reporting, ad hoc query and analysis, OLAP, dashboards and scorecards with a rich end user experience that includes visualization, collaboration, alerts and notifications, search and mobile access. · Open - meaning the BI platform integrates with any data source, ETL tool, business application, application server, security infrastructure, portal technology as well as any ODBC compliant third party analytical tool. The suite accesses data from multiple heterogeneous sources--including popular relational and multidimensional data sources and major ERP and CRM applications from Oracle and SAP. · Integrated - meaning the BI platform is based on an architecturally integrated technology foundation built on an open, standards based service oriented architecture. The platform features a common enterprise information model, common security model and a common configuration, deployment and systems management framework. To summarize, Oracle Business Intelligence is a comprehensive, integrated BI platform that lets you define strategy, identify objectives, assign ownership, define KPI's, collaborate, take action, monitor, report and do course corrections all form a single interface and a single system. The platform's integrated metadata model and task based design ensures that the entire workflow from defining strategy to execution to monitoring is completely integrated delivering end to end visibility, transparency and agility. Click here to learn more about Oracle BI 11g.

Read the article
ODI 12c - Aggregating Data

- by David Allan

This posting will look at the aggregation component that was introduced in ODI 12c. For many ETL tool users this shouldn't be a big surprise, its a little different than ODI 11g but for good reason. You can use this component for composing data with relational like operations such as sum, average and so forth. Also, Oracle SQL supports special functions called Analytic SQL functions, you can use a specially configured aggregation component or the expression component for these now in ODI 12c. In database systems an aggregate transformation is a transformation where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning - that's exactly the purpose of the aggregate component. In the image below you can see the aggregate component in action within a mapping, for how this and a few other examples are built look at the ODI 12c Aggregation Viewlet here - the viewlet illustrates a simple aggregation being built and then some Oracle analytic SQL such as AVG(EMP.SAL) OVER (PARTITION BY EMP.DEPTNO) built using both the aggregate component and the expression component. In 11g you used to just write the aggregate expression directly on the target, this made life easy for some cases, but it wan't a very obvious gesture plus had other drawbacks with ordering of transformations (agg before join/lookup. after set and so forth) and supporting analytic SQL for example - there are a lot of postings from creative folks working around this in 11g - anything from customizing KMs, to bypassing aggregation analysis in the ODI code generator. The aggregate component has a few interesting aspects. 1. Firstly and foremost it defines the attributes projected from it - ODI automatically will perform the grouping all you do is define the aggregation expressions for those columns aggregated. In 12c you can control this automatic grouping behavior so that you get the code you desire, so you can indicate that an attribute should not be included in the group by, that's what I did in the analytic SQL example using the aggregate component. 2. The component has a few other properties of interest; it has a HAVING clause and a manual group by clause. The HAVING clause includes a predicate used to filter rows resulting from the GROUP BY clause. Because it acts on the results of the GROUP BY clause, aggregation functions can be used in the HAVING clause predicate, in 11g the filter was overloaded and used for both having clause and filter clause, this is no longer the case. If a filter is after an aggregate, it is after the aggregate (not sometimes after, sometimes having). 3. The manual group by clause let's you use special database grouping grammar if you need to. For example Oracle has a wealth of highly specialized grouping capabilities for data warehousing such as the CUBE function. If you want to use specialized functions like that you can manually define the code here. The example below shows the use of a manual group from an example in the Oracle database data warehousing guide where the SUM aggregate function is used along with the CUBE function in the group by clause. The SQL I am trying to generate looks like the following from the data warehousing guide; SELECT channel_desc, calendar_month_desc, countries.country_iso_code, TO_CHAR(SUM(amount_sold), '9,999,999,999') SALES$ FROM sales, customers, times, channels, countries WHERE sales.time_id=times.time_id AND sales.cust_id=customers.cust_id AND sales.channel_id= channels.channel_id AND customers.country_id = countries.country_id AND channels.channel_desc IN ('Direct Sales', 'Internet') AND times.calendar_month_desc IN ('2000-09', '2000-10') AND countries.country_iso_code IN ('GB', 'US') GROUP BY CUBE(channel_desc, calendar_month_desc, countries.country_iso_code); I can capture the source datastores, the filters and joins using ODI's dataset (or as a traditional flow) which enables us to incrementally design the mapping and the aggregate component for the sum and group by as follows; In the above mapping you can see the joins and filters declared in ODI's dataset, allowing you to capture the relationships of the datastores required in an entity-relationship style just like ODI 11g. The mix of ODI's declarative design and the common flow design provides for a familiar design experience. The example below illustrates flow design (basic arbitrary ordering) - a table load where only the employees who have maximum commission are loaded into a target. The maximum commission is retrieved from the bonus datastore and there is a look using employees as the driving table and only those with maximum commission projected. Hopefully this has given you a taster for some of the new capabilities provided by the aggregate component in ODI 12c. In summary, the actions should be much more consistent in behavior and more easily discoverable for users, the use of the components in a flow graph also supports arbitrary designs and the tool (rather than the interface designer) takes care of the realization using ODI's knowledge modules. Interested to know if a deep dive into each component is interesting for folks. Any thoughts?

Read the article
Maximize Performance and Availability with Oracle Data Integration

- by Tanu Sood

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-fareast-font-family:Calibri; mso-bidi-font-family:"Times New Roman";} Alert: Oracle is hosting the 12c Launch Webcast for Oracle Data Integration and Oracle Golden Gate on Tuesday, November 12 (tomorrow) to discuss the new capabilities in detail and share customer perspectives. Hear directly from customer experts and executives from SolarWorld Industries America, British Telecom and Rittman Mead and get your questions answered live by product experts. Register for this complimentary webcast today and join in the discussion tomorrow. Author: Irem Radzik, Senior Principal Product Director, Oracle Organizations that want to use IT as a strategic point of differentiation prefer Oracle’s complete application offering to drive better business performance and optimize their IT investments. These enterprise applications are in the center of business operations and they contain critical data that needs to be accessed continuously, as well as analyzed and acted upon in a timely manner. These systems also need to operate with high-performance and availability, which means analytical functions should not degrade applications performance, and even system maintenance and upgrades should not interrupt availability. Oracle’s data integration products, Oracle Data Integrator, Oracle GoldenGate, and Oracle Enterprise Data Quality, provide the core foundation for bringing data from various business-critical systems to gain a broader, unified view. As a more advance offering to 3rd party products, Oracle’s data integration products facilitate real-time reporting for Oracle Applications without impacting application performance, and provide ability to upgrade and maintain the system without taking downtime. Oracle GoldenGate is certified for Oracle Applications, including E-Business Suite, Siebel CRM, PeopleSoft, and JD Edwards, for moving transactional data in real-time to a dedicated operational reporting environment. This solution allows the app users to offload the resource-heavy queries to the reporting instance(s), reducing CPU utilization, improving OLTP performance, and extending the lifetime of existing IT assets. In addition, having a dedicated reporting instance with up-to-the-second transactional data allows optimizing the reporting environment and even decreasing costs as GoldenGate can move only the required data from expensive mainframe environments to cost-efficient open system platforms. With real-time data replication capabilities GoldenGate is also certified to enable application upgrades and database/hardware/OS migration without impacting business operations. GoldenGate is certified for Siebel CRM, Communications Billing and Revenue Management and JD Edwards for supporting zero downtime upgrades to the latest app version. GoldenGate synchronizes a parallel, upgraded system with the old version in real time, thus enables continuous operations during the process. Oracle GoldenGate is also certified for minimal downtime database migrations for Oracle E-Business Suite and other key applications. GoldenGate’s solution also minimizes the risk by offering a failback option after the switchover to the new environment. Furthermore, Oracle GoldenGate’s bidirectional active-active data replication is certified for Oracle ATG Web Commerce to enable geographically load balancing and high availability for ATG customers. For enabling better business insight, Oracle Data Integration products power Oracle BI Applications with high performance bulk and real-time data integration. Oracle Data Integrator (ODI) is embedded in Oracle BI Applications version 11.1.1.7.1 and helps to integrate data end-to-end across the full BI Applications architecture, supporting capabilities such as data-lineage, which helps business users identify report-to-source capabilities. ODI is integrated with Oracle GoldenGate and provides Oracle BI Applications customers the option to use real-time transactional data in analytics, and do so non-intrusively. By using Oracle GoldenGate with the latest release of Oracle BI Applications, organizations not only leverage fresh data in analytics, but also eliminate the need for an ETL batch window and minimize the impact on OLTP systems. You can learn more about Oracle Data Integration products latest 12c version in our upcoming launch webcast and access the app-specific free resources in the new Data Integration for Oracle Applications Resource Center.

Read the article
Real Time BI in the Real World

- by tobin.gilman(at)oracle.com

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Times New Roman","serif";} Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";} One of my favorite BI offerings from Oracle is a solution called Oracle Real Time Decisions. Whenever I mention this product in customer meetings, eyes light up. There are some fascinating examples of customers using it to up-sell, cross-sell, increase customer retention, and reduce risk in real time, with off the charts return on investment. I plan to share some of those stories in a future blog. In this post however, I want to share some far more common real time analytics use case scenarios that are being addressed with widely deployed Oracle BI and data integration technologies Not all real time BI applications require continuous learning, predictive modeling, and data mining. Many simply require the ability to integrate, aggregate, and access information that is current (typically within in few minutes or a few seconds). The use cases are infinite. A few I've seen: · Purchasing agents need to match demand against available inventory · Manufacturing planners need to monitor current parts and material against scheduled build plans · Airline agents need to match ticket demand against flight schedules, · Human resources managers need to track the status of global hiring requisitions against current headcount authorizations...you get the idea. One way of doing this is to run reports or federated queries directly against transactional systems. That approach can be viable if you only need to access simple data sets on rare occasions. High volume and complex queries can quickly bog down performance of mission critical transactional systems. There is an architecturally simple way of solving the problem, and it's being applied by real companies around the world to solve real needs in real time. Cbeyond is an Atlanta, GA based provider of voice, data and mobile business applications delivers. They deliver real time information to its call center agents as they are interacting with their customers. The data they need resides in production CRM and other transactional systems, but instead or reporting directly off the those systems, data is first moved to an operational data store (ODS). Rather than running data intensive, time consuming, and performance degrading batch ETL routines to populate the ODS, Cbeyond uses Oracle Golden Gate software to incrementally capture and move only the changed records from log files of the transactional systems every few minutes. There is no impact on transactional system performance, and the information needed by call center representatives is up to date. Oracle Business Intelligence software presents the information to services reps in a rich, visual, and highly interactive format. Avea is similar to Cbeyond. They are a telecommunications company who integrates billing and customer information in an ODS that is accessed by their call center agents in real time using Oracle Golden Gate and Oracle Business Intelligence. They've taken it a step further by using the ODS to feed a data warehouse. The operational data store provides the current information needed by call center agents during "in flight" customer interactions. The data warehouse is used for more sophisticated analysis of historical data. For maximum performance, both the ODS and data warehouse run on the Oracle Exadata Database Machine. These are practical illustrations of companies addressing real time reporting and analysis needs using established business intelligence/data warehousing methodologies and tools common to many IT departments. If real time BI could benefit your organization, you may be already be closer than you thought to having the pieces in place to solving the problem. Give us a shout if you are interested in learning more or if you have an interesting use or approach to real-time BI.

Read the article
What Counts For a DBA: Fitness

- by Louis Davidson

If you know me, you can probably guess that physical exercise is not really my thing. There was a time in my past when it a larger part of my life, but even then never in the same sort of passionate way as a number of our SQL friends. For me, I find that mental exercise satisfies what I believe to be the same inner need that drives people to run farther than I like to drive on most Saturday mornings, and it is certainly just as addictive. Mental fitness shares many common traits with physical fitness, especially the need to attain it through repetitive training. I only wish that mental training burned off a bacon cheeseburger in the same manner as does jogging around a dewy park on Saturday morning. In physical training, there are at least two goals, the first of which is to be physically able to do a task. The second is to train the brain to perform the task without thinking too hard about it. No matter how long it has been since you last rode a bike, you will be almost certainly be able to hop on and start riding without thinking about the process of pedaling or balancing. If you’ve never ridden a bike, you could be a physics professor /Olympic athlete and still crash the first few times you try, even though you are as strong as an ox and your knowledge of the physics of bicycle riding makes the concept child’s play. For programming tasks, the process is very similar. As a DBA, you will come to know intuitively how to backup, optimize, and secure database systems. As a data programmer, you will work to instinctively use the clauses of Transact-SQL DML so that, when you need to group data three ways (and not four), you will know to use the GROUP BY clause with GROUPING SETS without resorting to a search engine. You have the skill. Making it naturally then requires repetition and experience is the primary requirement, not just simply learning about a topic. The hardest part of being really good at something is this difference between knowledge and skill. I have recently taken several informative training classes with Kimball University on data warehousing and ETL. Now I have a lot more knowledge about designing data warehouses than before. I have also done a good bit of data warehouse designing of late and have started to improve to some level of proficiency with the theory. Yet, for all of this head knowledge, it is still a struggle to take what I have learned and apply it to the designs I am working on. Data warehousing is still a task that is not yet deeply ingrained in my brain muscle memory. On the other hand, relational database design is something that no matter how much or how little I may get to do it, I am comfortable doing it. I have done it as a profession now for well over a decade, I teach classes on it, and I also have done (and continue to do) a lot of mental training beyond the work day. Sometimes the training is just basic education, some reading blogs and attending sessions at PASS events. My best training comes from spending time working on other people’s design issues in forums (though not nearly as much as I would like to lately). Working through other people’s problems is a great way to exercise your brain on problems with which you’re not immediately familiar. The final bit of exercise I find useful for cultivating mental fitness for a data professional is also probably the nerdiest thing that I will ever suggest you do. Akin to running in place, the idea is to work through designs in your head. I have designed more than one database system that would revolutionize grocery store operations, sales at my local Target store, the ordering process at Amazon, and ways to improve Disney World operations to get me through a line faster (some of which they are starting to implement without any of my help.) Never are the designs truly fleshed out, but enough to work through structures and processes. On “paper”, I have designed database systems to catalog things as trivial as my Lego creations, rental car companies and my audio and video collections. Once I get the database designed mentally, sometimes I will create the database, add some data (often using Red-Gate’s Data Generator), and write a few queries to see if a concept was realistic, but I will rarely fully flesh out the database since I have no desire to do any user interface programming anymore. The mental training allows me to keep in practice for when the time comes to do the work I love the most for real…even if I have been spending most of my work time lately building data warehouses. If you are really strong of mind and body, perhaps you can mix a mental run with a physical run; though don’t run off of a cliff while contemplating how you might design a database to catalog the trees on a mountain…that would be contradictory to the purpose of both types of exercise.

Read the article
OBIA on Teradata - Part 1 Loader and Monitoring

- by Mohan Ramanuja

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} The out-of-the-box (OOB) OBIA Informatica mappings come with TPump loader. TPUMP FASTLOAD TPump does not lock the table. FastLoad applies exclusive lock on the table. The table that TPump is loading can have data. The table that FastLoad is loading needs to be empty. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} TPump is not efficient with lookups. FastLoad is more efficient in the absence of lookups. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} The out-of the box Informatica mappings come with TPump loader. There is chance for bottleneck in writer thread The out-of the box tables in Teradata supplied with OBAW features all Dimension and Fact tables using ROW_WID as the key for primary index. Also, all staging tables use integration_id as the key for primary index. This reduces skewing of data across Teradata AMPs.You can use an SQL statement similar to the following to determine if data for a given table is distributed evenly across all AMP vprocs. The SQL statement displays the AMP with the most used through the AMP with the least-used space, investigating data distribution in the Message table in database RST.SELECT vproc,CurrentPermFROM DBC.TableSizeWHERE Databasename = ‘PRJ_CRM_STGC’AND Tablename = ‘w_party_per_d’ORDER BY 2 descIf you suspect distribution problems (skewing) among AMPS, the following is a sample of what you might enter for a three-column PI:SELECT HASHAMP (HASHBUCKET (HASHROW (col_x, col_y, col_z))), count (*)FROM hash15GROUP BY 1ORDER BY 2 desc; ETL Error Monitoring Error Table – These are tables that start with ET. Location and name can be specified in Informatica session as well as the loader connection.Loader Log – Loader log is available in the Informatica server under the session log folder. These give feedback on the loader parameters such as Packing Factor to use. These however need to be monitored in the production environment. The recommendations made in one environment may not be used in another environment.Log Table – These are tables that start with TL. These are sparse on information.Bad File – This is the Informatica file generated in case there is data quality issues

Read the article
Right-Time Retail Part 2

- by David Dorf

This is part two of the three-part series. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Right-Time Integration Of course these real-time enabling technologies are only as good as the systems that utilize them, and it only takes one bottleneck to slow everyone else down. What good is an immediate stock-out notification if the supply chain can’t react until tomorrow? Since being formed in 2006, Oracle Retail has been not only adding more integrations between systems, but also modernizing integrations for appropriate speed. Notice I tossed in the word “appropriate.” Not everything needs to be real-time – again, we’re talking about Right-Time Retail. The speed of data capture, analysis, and execution must be synchronized or you’re wasting effort. Unfortunately, there isn’t an enterprise-wide dial that you can crank-up for your estate. You’ll need to improve things piecemeal, with people and processes as limiting factors while choosing the appropriate types of integrations. There are three integration styles we see in the retail industry. First is batch. I know, the word “batch” just sounds slow, but this pattern is less about velocity and more about volume. When there are large amounts of data to be moved, you’ll want to use batch processes. Our technology of choice here is Oracle Data Integrator (ODI), which provides a fast version of Extract-Transform-Load (ETL). Instead of the three-step process, the load and transform steps are combined to save time. ODI is a key technology for moving data into Retail Analytics where we can apply science. Performing analytics on each sale as it occurs doesn’t make any sense, so we batch up a statistically significant amount and submit all at once. The second style is fire-and-forget. For some types of data, we want the data to arrive ASAP but immediacy is not necessary. Speed is less important than guaranteed delivery, so we use message-oriented middleware available in both Weblogic and the Oracle database. For example, Point-of-Service transactions are queued for delivery to Central Office at corporate. If the network is offline, those transactions remain in the queue and will be delivered when the network returns. Transactions cannot be lost and they must be delivered in order. (Ever tried processing a return before the sale?) To enhance the standard queues, we offer the Retail Integration Bus (RIB) to help the management and monitoring of fire-and-forget messaging in the enterprise. The third style is request-response and is most commonly implemented as Web services. This is a synchronous message where the sender waits for a response. In this situation, the volume of data is small, guaranteed delivery is not necessary, but speed is very important. Examples include the website checking inventory, a price lookup, or processing a credit card authorization. The Oracle Service Bus (OSB) typically handles the routing of such messages, and we’ve enhanced its abilities with the Retail Service Backbone (RSB). To better understand these integration patterns and where they apply within the retail enterprise, we’re providing the Retail Reference Library (RRL) at no charge to Oracle Retail customers. The library is composed of a large number of industry business processes, including those necessary to support Commerce Anywhere, as well as detailed architectural diagrams. These diagrams allow implementers to understand the systems involved in integrations and the specific data payloads. Furthermore, with our upcoming release we’ll be providing a new tool called the Retail Integration Console (RIC) that allows IT to monitor and manage integrations from a single point. Using RIC, retailers can quickly discern where integration activity is occurring, volume statistics, average response times, and errors. The dashboards provide the ability to dive down into the architecture documentation to gather information all the way down to the specific payload. Retailers that want real-time integrations will also need real-time monitoring of those integrations to ensure service-level agreements are maintained. Part 3 looks at marketing.

Read the article
Database Partitioning and Multiple Data Source Considerations

- by Jeffrey McDaniel

With the release of P6 Reporting Database 3.0 partitioning was added as a feature to help with performance and data management. Careful investigation of requirements should be conducting prior to installation to help improve overall performance throughout the lifecycle of the data warehouse, preventing future maintenance that would result in data loss. Before installation try to determine how many data sources and partitions will be required along with the ranges. In P6 Reporting Database 3.0 any adjustments outside of defaults must be made in the scripts and changes will require new ETL runs for each data source. Considerations: 1. Standard Edition or Enterprise Edition of Oracle Database. If you aren't using Oracle Enterprise Edition Database; the partitioning feature is not available. Multiple Data sources are only supported on Enterprise Edition of Oracle Database. 2. Number of Data source Ids for partitioning during configuration. This setting will specify how many partitions will be allocated for tables containing data source information. This setting requires some evaluation prior to installation as there are repercussions if you don't estimate correctly. For example, if you configured the software for only 2 data sources and the partition setting was set to 2, however along came a 3rd data source. The necessary steps to accommodate this change are as follows: a) By default, 3 partitions are configured in the Reporting Database scripts. Edit the create_star_tables_part.sql script located in <installation directory>\star\scripts and search for partition. You’ll see P1, P2, P3. Add additional partitions and sub-partitions for P4 and so on. These will appear in several areas. (See P6 Reporting Database 3.0 Installation and Configuration guide for more information on this and how to adjust partition ranges). b) Run starETL -r. This will recreate each table with the new partition key. The effect of this step is that all tables data will be lost except for history related tables. c) Run starETL for each of the 3 data sources (with the data source # (starETL.bat "-s2" -as defined in P6 Reporting Database 3.0 Installation and Configuration guide) The best strategy for this setting is to overestimate based on possible growth. If during implementation it is deemed that there are atleast 2 data sources with possibility for growth, it is a better idea to set this setting to 4 or 5, allowing room for the future and preventing a ‘start over’ scenario. 3. The Number of Partitions and the Number of Months per Partitions are not specific to multi-data source. These settings work in accordance to a sub partition of larger tables with regard to time related data. These settings are dataset specific for optimization. The number of months per partition is self explanatory, optimally the smaller the partition, the better query performance so if the dataset has an extremely large number of spread/history records, a lower number of months is optimal. Working in accordance with this setting is the number of partitions, this will determine how many "buckets" will be created per the number of months setting. For example, if you kept the default for # of partitions of 3, and select 2 months for each partitions you would end up with: -1st partition, 2 months -2nd partition, 2 months -3rd partition, all the remaining records Therefore with records to this setting, it is important to analyze your source db spread ranges and history settings when determining the proper number of months per partition and number of partitions to optimize performance. Also be aware the DBA will need to monitor when these partition ranges will fill up and when additional partitions will need to be added. If you get to the final range partition and there are no additional range partitions all data will be included into the last partition.

Read the article
RAD Visual Web Application Creator/ Builder/ Designer for PHP

- by inhoue

Hi all, I want to see if any of you know a (free and open source will be ideal) tool/ app that can help build a php web application very quickly without investing too much time on writing codes, preferring drag and drop/ point and click work-flow designer for logic design (see Agile from Outsystems below). Plus, visual designer for the business logic is great since it can help a developer visualize the logic better. There are a lot of GUI builders, form builders out there, but I am looking for one app for the entire web application development process. My goal is to find an application that a team of developers can use together and use the build-in code of the app as much as possible. E.g. the app will provide a modular just for handle user login or a shopping cart; a developer just need to drag and drop the modular to the logic designer and the code will be generated. This way the functionality will be in a module and code will always be standard across developers. So if a new developer get on-board, he will just need to use the system and get up and running quickly. To explain this better: there is a lot php frameworks, e.g. cakephp, CodeIgniter, etc which I can use to help coding, but still I need to create (code) the GUI, writing quite a bit of codes. I am looking for a tool/ app that is a little more high level than those frameworks. Here is 2 examples apps I found during my google search which they have visual logic designer and gui builder in one single app. Also a single click deployment (but I need it to be php apps or at least I can deploy the (php) code to a LAMP/ WAMP server): Wavemaker: for JAVA Agile from Outsystems: for JAVA or .net (This one is really good, with work-flow drag and drop logic designer!) Talend: it is just an ETL tool, but the concept is what I want to bring up. Drag and drop, point and click logic design. Custom code can be added if it is needed, but the drag and drop process already finished the structure and most of the coding of the web app one needs to build. I want to list Adobe Flex, but it is more like a GUI designer + IDE, not exactly what I want to describe here. The drag and drop/ work-flow logic designer is a key for the app. I could go for the CMS route by learning how to extend them, but it is not that flexible for me and is a long learning curve. Anybody came across this type of app before? Or any idea of how can I find those apps? I googled them for long time, I don't see any of them for php and just few (just 2) for Java. Thanks in advance!

Read the article
ODI 12c - Parallel Table Load

- by David Allan

In this post we will look at the ODI 12c capability of parallel table load from the aspect of the mapping developer and the knowledge module developer - two quite different viewpoints. This is about parallel table loading which isn't to be confused with loading multiple targets per se. It supports the ability for ODI mappings to be executed concurrently especially if there is an overlap of the datastores that they access, so any temporary resources created may be uniquely constructed by ODI. Temporary objects can be anything basically - common examples are staging tables, indexes, views, directories - anything in the ETL to help the data integration flow do its job. In ODI 11g users found a few workarounds (such as changing the technology prefixes - see here) to build unique temporary names but it was more of a challenge in error cases. ODI 12c mappings by default operate exactly as they did in ODI 11g with respect to these temporary names (this is also true for upgraded interfaces and scenarios) but can be configured to support the uniqueness capabilities. We will look at this feature from two aspects; that of a mapping developer and that of a developer (of procedures or KMs). 1. Firstly as a Mapping Developer..... 1.1 Control when uniqueness is enabled A new property is available to set unique name generation on/off. When unique names have been enabled for a mapping, all temporary names used by the collection and integration objects will be generated using unique names. This property is presented as a check-box in the Property Inspector for a deployment specification. 1.2 Handle cleanup after successful execution Provided that all temporary objects that are created have a corresponding drop statement then all of the temporary objects should be removed during a successful execution. This should be the case with the KMs developed by Oracle. 1.3 Handle cleanup after unsuccessful execution If an execution failed in ODI 11g then temporary tables would have been left around and cleaned up in the subsequent run. In ODI 12c, KM tasks can now have a cleanup-type task which is executed even after a failure in the main tasks. These cleanup tasks will be executed even on failure if the property 'Remove Temporary Objects on Error' is set. If the agent was to crash and not be able to execute this task, then there is an ODI tool (OdiRemoveTemporaryObjects here) you can invoke to cleanup the tables - it supports date ranges and the like. That's all there is to it from the aspect of the mapping developer it's much, much simpler and straightforward. You can now execute the same mapping concurrently or execute many mappings using the same resource concurrently without worrying about conflict. 2. Secondly as a Procedure or KM Developer..... In the ODI Operator the executed code shows the actual name that is generated - you can also see the runtime code prior to execution (introduced in 11.1.1.7), for example below in the code type I selected 'Pre-executed Code' this lets you see the code about to be processed and you can also see the executed code (which is the default view). References to the collection (C$) and integration (I$) names will be automatically made unique by using the odiRef APIs - these objects will have unique names whenever concurrency has been enabled for a particular mapping deployment specification. It's also possible to use name uniqueness functions in procedures and your own KMs. 2.1 New uniqueness tags You can also make your own temporary objects have unique names by explicitly including either %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG in the name passed to calls to the odiRef APIs. Such names would always include the unique tag regardless of the concurrency setting. To illustrate, let's look at the getObjectName() method. At <% expansion time, this API will append %UNIQUE_STEP_TAG to the object name for collection and integration tables. The name parameter passed to this API may contain %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG. This API always generates to the <? version of getObjectName() At execution time this API will replace the unique tag macros with a string that is unique to the current execution scope. The returned name will conform to the name-length restriction for the target technology, and its pattern for the unique tag. Any necessary truncation will be performed against the initial name for the object and any other fixed text that may have been specified. Examples are:- <?=odiRef.getObjectName("L", "%COL_PRFEMP%UNIQUE_STEP_TAG", "D")?> SCOTT.C$_EABH7QI1BR1EQI3M76PG9SIMBQQ <?=odiRef.getObjectName("L", "EMP%UNIQUE_STEP_TAG_AE", "D")?> SCOTT.EMPAO96Q2JEKO0FTHQP77TMSAIOSR_ Methods which have this kind of support include getFrom, getTableName, getTable, getObjectShortName and getTemporaryIndex. There are APIs for retrieving this tag info also, the getInfo API has been extended with the following properties (the UNIQUE* properties can also be used in ODI procedures); UNIQUE_STEP_TAG - Returns the unique value for the current step scope, e.g. 5rvmd8hOIy7OU2o1FhsF61 Note that this will be a different value for each loop-iteration when the step is in a loop. UNIQUE_SESSION_TAG - Returns the unique value for the current session scope, e.g. 6N38vXLrgjwUwT5MseHHY9 IS_CONCURRENT - Returns info about the current mapping, will return 0 or 1 (only in % phase) GUID_SRC_SET - Returns the UUID for the current source set/execution unit (only in % phase) The getPop API has been extended with the IS_CONCURRENT property which returns info about an mapping, will return 0 or 1. 2.2 Additional APIs Some new APIs are provided including getFormattedName which will allow KM developers to construct a name from fixed-text or ODI symbols that can be optionally truncate to a max length and use a specific encoding for the unique tag. It has syntax getFormattedName(String pName[, String pTechnologyCode]) This API is available at both the % and the ? phase. The format string can contain the ODI prefixes that are available for getObjectName(), e.g. %INT_PRF, %COL_PRF, %ERR_PRF, %IDX_PRF alongwith %UNIQUE_STEP_TAG or %UNIQUE_SESSION_TAG. The latter tags will be expanded into a unique string according to the specified technology. Calls to this API within the same execution context are guaranteed to return the same unique name provided that the same parameters are passed to the call. e.g. <%=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG_AE", "ORACLE")%> <?=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG_AE", "ORACLE")?> C$_MY_TAB7wDiBe80vBog1auacS1xB_AE <?=odiRef.getFormattedName("%COL_PRFMY_TABLE%UNIQUE_STEP_TAG.log", "FILE")?> C2_MY_TAB7wDiBe80vBog1auacS1xB.log 2.3 Name length generation As part of name generation, the length of the generated name will be compared with the maximum length for the target technology and truncation may need to be applied. When a unique tag is included in the generated string it is important that uniqueness is not compromised by truncation of the unique tag. When a unique tag is NOT part of the generated name, the name will be truncated by removing characters from the end - this is the existing 11g algorithm. When a unique tag is included, the algorithm will first truncate the <postfix> and if necessary the <prefix>. It is recommended that users will ensure there is sufficient uniqueness in the <prefix> section to ensure uniqueness of the final resultant name. SUMMARY To summarize, ODI 12c make it much simpler to utilize mappings in concurrent cases and provides APIs for helping developing any procedures or custom knowledge modules in such a way they can be used in highly concurrent, parallel scenarios.

Read the article
Blogging tips for SQL Server professionals

- by jamiet

For some time now I have been intending to put some material together relating my blogging experiences since I began blogging in 2004 and that led to me submitting a session for SQLBits recently where I intended to do just that. That didn’t get enough votes to allow me to present however so instead I resolved to write a blog post about it and Simon Sabin’s recent post Blogging – how do you do it? has prompted me to get around to completing it. So, here I present a compendium of tips that I’ve picked up from authoring a fair few blog posts over the past 6 years. Feedburner Feedburner.com is a service that can consume your blog’s default RSS feed and provide another, replacement, feed that has exactly the same content. You can then supply that replacement feed on your blog site for other people to consume in their RSS readers. Why would you want to do this? Well, two reasons actually: It makes your blog portable. If you ever want to move your blog to a different URL you don’t have to tell your subscribers to move to a different feed. The feedburner feed is a pointer to your blog content rather than being a copy of it. Feedburner will collect stats telling you how many people are subscribed to your feed, which RSS readers they use, stuff like that. Here’s a sample screenshot for http://sqlblog.com/blogs/jamie_thomson/: It also tells you what your most viewed posts are: Web stats like these are notoriously inaccurate but then again the method of measurement here is not important, what IS important is that it gives you a trustworthy ranking of your blog posts and (in my opinion) knowing which are your most popular posts is more important than knowing exactly how many views each post has had. This is just the tip of the iceberg of what Feedburner provides and I recommend every new blogger to try it! Monitor subscribers using Google Reader If for some reason Feedburner is not to your taste or (more likely) you already have an established RSS feed that you do not want to change then Google provide another way in which you can monitor your readership in the shape of their online RSS reader, Google Reader. It provides, for every RSS feed, a collection of stats including the number of Google Reader users that have subscribed to that RSS feed. This is really valuable information and in fact I have been recording this statistic for mine and a number of other blogs for a few years now and as such I can produce the following chart that indicates how readership is trending for those blogs over time: [Good news for my fellow SQLBlog bloggers.] As Stephen Few readily points out, its not the numbers that are important but the trend. Search Engine Optimisation (SEO) SEO (or “How do I get my blog to show up in Google”) is a massive area of expertise which I don’t want (and am unable) to cover in much detail here but there are some simple rules of thumb that will help: Tags – If your blog engine offers the ability to add tags to your blog post, use them. Invariably those tags go into the meta section of the page HTML and search engines lap that stuff up. For example, from my recent post Microsoft publish Visual Studio 2010 Database Project Guidance: Title – Search engines take notice of web page titles as well so make them specific and descriptive (e.g. “Configuring dtsConfig connection strings”) rather than esoteric and meaningless in a vain attempt to be humorous (e.g. “Last night a DJ saved my ETL batch”)! Title(2) – Make your title even more search engine friendly by mentioning high level subject areas, not dissimilar to Twitter hashtags. For example, if you look at all of my posts related to SSIS you will notice that nearly all contain the word “SSIS” in the title even if I had to shoehorn it in there by putting it in square brackets or similar. Another tip, if you ARE putting words into your titles in this artificial manner then put them at the end so that they’re not that prominent in search engine results; they’re there for the search engines to consume, not for human beings. Images – Always add titles and alternate text (ALT attribute) to images in your blog post. If you use Windows 7 or Windows Vista then you can use Live Writer (which Simon recommended) makes this easy for you. Headings – If you want to highlight section headings use heading tags (e.g. <H1>, <H2>, <H3> etc…) rather than just formatting the text appropriately – again, Live makes this easy. These tags give your blog posts structure that is understood by search engines and RSS readers alike. (I believe it makes them more amenable to CSS as well – though that’s not something I know too much about). If you check the HTML source for the blog post you’re reading right now you’ll be able to scan through and see where I have used heading tags. Microsoft provide a free tool called the SEO Toolkit that will analyse your blog site (for free) and tell you what things you should change to improve SEO. Go read more and download for free at Search Engine Optimization Toolkit. Did I mention that it was free? Miscellaneous Tips If you are including code in your blog post then ensure it is formatted correctly. Use SQL Server Central’s T-SQL prettifier for formatting T-SQL code. Use images and videos. Personally speaking there’s nothing I like less when reading a blog than paragraph after paragraph of text. Images make your blog more appealing which means people are more likely to read what you have written. Be original. Don’t plagiarise other people’s content and don’t simply rewrite the contents of Books Online. Every time you publish a blog post tweet a link to it. Include hashtags in your tweet that are more likely to grab people’s attention. That’s probably enough for now - I hope this blog post proves useful to someone out there. If you would appreciate a related session at a forthcoming SQLBits conference then please let me know. This will likely be my last blog post for 2010 so I would like to take this opportunity to thank everyone that has commented on, linked to or read any of my blog posts in that time. 2011 is shaping up to be a very interesting for SQL Server observers with the impending release of SQL Server code-named Denali and I promise I’ll have lots more content on that as the year progresses. Happy New Year. @Jamiet

Read the article

Search Results

Search found 266 results on 11 pages for 'etl'.

Page 10/11 | < Previous Page | 6 7 8 9 10 11 | Next Page >

- by jamiet

- by Jean-Pierre Dijcks

- by Pinal Dave

- by BuckWoody

- by Kaleb Pederson

- by Marco Russo (SQLBI)

- by Mike Femenella

- by generalt

- by EdgarVerona

- by Marcus Swope

- by Steve Guidi

- by TWood

- by Most Valuable Yak (Rob Volk)

- by Tanu Sood

- by manan.goel(at)oracle.com

- by David Allan

- by Tanu Sood

- by tobin.gilman(at)oracle.com

- by Louis Davidson

- by Mohan Ramanuja

- by David Dorf

- by Jeffrey McDaniel

- by inhoue

- by David Allan

- by jamiet

< Previous Page | 6 7 8 9 10 11 | Next Page >