Search Results

Search found 3797 results on 152 pages for 'talk'.

Page 34/152 | < Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41 | Next Page >

Microsoft Access as a Weapon of War

- by Damon Armstrong

A while ago (probably a decade ago, actually) I saw a report on a tracking system maintained by a U.S. Army artillery control unit. This system was capable of maintaining a bearing on various units in the field to help avoid friendly fire. I consider the U.S. Army to be the most technologically advanced fighting force on Earth, but to my terror I saw something on the title bar of an application displayed on a laptop behind one of the soldiers they were interviewing: Tracking.mdb Oh yes. Microsoft Office Suite had made it onto the battlefield. My hope is that it was just running as a front-end for a more proficient database (no offense Access people), or that the soldier was tracking something else like KP duty or fantasy football scores. But I could also see the corporate equivalent of a pointy-haired boss walking into a cube and asking someone who had piddled with Access to build a database for HR forms. Except this pointy-haired boss would have been a general, the cube would have been a tank, and the HR forms would have been targets that, if something went amiss, would have been hit by a 500lb artillery round. Hope that solider could write a good query

Read the article
Microsoft Access as a Weapon of War

- by Damon

A while ago (probably a decade ago, actually) I saw a report on a tracking system maintained by a U.S. Army artillery control unit. This system was capable of maintaining a bearing on various units in the field to help avoid friendly fire. I consider the U.S. Army to be the most technologically advanced fighting force on Earth, but to my terror I saw something on the title bar of an application displayed on a laptop behind one of the soldiers they were interviewing: Tracking.mdb Oh yes. Microsoft Office Suite had made it onto the battlefield. My hope is that it was just running as a front-end for a more proficient database (no offense Access people), or that the soldier was tracking something else like KP duty or fantasy football scores. But I could also see the corporate equivalent of a pointy-haired boss walking into a cube and asking someone who had piddled with Access to build a database for HR forms. Except this pointy-haired boss would have been a general, the cube would have been a tank, and the HR forms would have been targets that, if something went amiss, would have been hit by a 500lb artillery round. Hope that solider could write a good query :)

Read the article
Subterranean IL: The ThreadLocal type

- by Simon Cooper

I came across ThreadLocal<T> while I was researching ConcurrentBag. To look at it, it doesn't really make much sense. What's all those extra Cn classes doing in there? Why is there a GenericHolder<T,U,V,W> class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem. Thread statics Declaring that a variable is thread static, that is, values assigned and read from the field is specific to the thread doing the reading, is quite easy in .NET: [ThreadStatic] private static string s_ThreadStaticField; ThreadStaticAttribute is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity. TheadStaticAttribute provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitary number of thread static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution - thread local data slots. This is a lesser-known function of Thread that has existed since .NET 1.1: LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1"); string value = "foo"; Thread.SetData(threadSlot, value); string gettedValue = (string)Thread.GetData(threadSlot); Each instance of LocalStoreDataSlot mediates access to a single slot, and each slot acts like a separate thread-static field. As you can see, using thread data slots is quite cumbersome. You need to keep track of LocalDataStoreSlot objects, it's not obvious how instances of LocalDataStoreSlot correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in Thread itself, using various arrays, lists, and locks for synchronization. ThreadLocal<T> is far simpler and easier to use. ThreadLocal ThreadLocal provides an abstraction around thread-static fields that allows it to be used just like any other class; it can be used as a replacement for a thread-static field, it can be used in a List<ThreadLocal<T>>, you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as static, and so shared between all instances of the declaring type. There's something else going on here. The values stored in instances of ThreadLocal<T> are stored in instantiations of the GenericHolder<T,U,V,W> class, which contains a single ThreadStatic field (s_value) to store the actual value. This class is then instantiated with various combinations of the Cn types for generic arguments. In .NET, each separate instantiation of a generic type has its own static state. For example, GenericHolder<int,C0,C1,C2> has a completely separate s_value field to GenericHolder<int,C1,C14,C1>. This feature is (ab)used by ThreadLocal to emulate instance thread-static fields. Every time an instance of ThreadLocal is constructed, it is assigned a unique number from the static s_currentTypeId field using Interlocked.Increment, in the FindNextTypeIndex method. The hexadecimal representation of that number then defines the specific Cn types that instantiates the GenericHolder class. That instantiation is therefore 'owned' by that instance of ThreadLocal. This gives each instance of ThreadLocal its own ThreadStatic field through a specific unique instantiation of the GenericHolder class. Although GenericHolder has four type variables, the first one is always instantiated to the type stored in the ThreadLocal<T>. This gives three free type variables, each of which can be instantiated to one of 16 types (C0 to C15). This puts an upper limit of 4096 (163) on the number of ThreadLocal<T> instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of ThreadLocal<string>, and separately a maximum of 4096 instances of ThreadLocal<object>, etc. However, there is an upper limit of 16384 enforced on the total number of ThreadLocal instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of GenericHolder<T,U,V,W>, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of ThreadLocal instances created is tracked by the ThreadLocalGlobalCounter class. So what happens when either limit is reached? Firstly, to try and stop this limit being reached, it recycles GenericHolder type indexes of ThreadLocal instances that get disposed using the s_availableIndices concurrent stack. This allows GenericHolder instantiations of disposed ThreadLocal instances to be re-used. But if there aren't any available instantiations, then ThreadLocal falls back on a standard thread local slot using TLSHolder. This makes it very important to dispose of your ThreadLocal instances if you'll be using lots of them, so the type instantiations can be recycled. The previous way of creating arbitary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, ThreadLocal can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of ThreadLocal its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.

Read the article
How much is a subscriber worth?

- by Tom Lewin

This year at Red Gate, we’ve started providing a way to back up SQL Azure databases and Azure storage. We decided to sell this as a service, instead of a product, which means customers only pay for what they use. Unfortunately for us, it makes figuring out revenue much trickier. With a product like SQL Compare, a customer pays for it, and it’s theirs for good. Sure, we offer support and upgrades, but, fundamentally, the sale is a simple, upfront transaction: we’ve made this product, you need this product, we swap product for money and everyone is happy. With software as a service, it isn’t that easy. The money and product don’t change hands up front. Instead, we provide a service in exchange for a recurring fee. We know someone buying SQL Compare will pay us $X, but we don’t know how long service customers will stay with us, or how much they will spend. How do we find this out? We use lifetime value analysis. What is lifetime value? Lifetime value, or LTV, is how much a customer is worth to the business. For Entrepreneurs has a brilliant write up that we followed to conduct our analysis. Basically, it all boils down to this equation: LTV = ARPU x ALC To make it a bit less of an alphabet-soup and a bit more understandable, we can write it out in full: The lifetime value of a customer equals the average revenue per customer per month, times the average time a customer spends with the service Simple, right? A customer is worth the average spend times the average stay. If customers pay on average $50/month, and stay on average for ten months, then a new customer will, on average, bring in $500 over the time they are a customer! Average spend is easy to work out; it’s revenue divided by customers. The problem comes when we realise that we don’t know exactly how long a customer will stay with us. How can we figure out the average lifetime of a customer, if we only have six months’ worth of data? The answer lies in the fact that: Average Lifetime of a Customer = 1 / Churn Rate The churn rate is the percentage of customers that cancel in a month. If half of your customers cancel each month, then your average customer lifetime is two months. The problem we faced was that we didn’t have enough data to make an estimate of one month’s cancellations reliable (because barely anybody cancels)! To deal with this data problem, we can take data from the last three months instead. This means we have more data to play with. We can still use the equation above, we just need to multiply the final result by three (as we worked out how many three month periods customers stay for, and we want our answer to be in months). Now these estimates are likely to be fairly unreliable; when there’s not a lot of data it pays to be cautious with inference. That said, the numbers we have look fairly consistent, and it’s super easy to revise our estimates when new data comes in. At the very least, these numbers give us a vague idea of whether a subscription business is viable. As far as Cloud Services goes, the business looks very viable indeed, and the low cancellation rates are much more than just data points in LTV equations; they show that the product is working out great for our customers, which is exactly what we’re looking for!

Read the article
Security Issues When Creating Pages in SharePoint

- by Damon

I was speaking (or rather IM'ing) with Ben Collins a while back and he came across an interesting problem that I wanted to document for the sake of posterity. If you have a SharePoint user who has permissions to create a page in a page library, but that user is having security issues trying to actually make a page, then it the security issue may be related to their access rights on the master page gallery. Users who create pages must have at least restricted read access to the master page gallery for page creation to succeed. That is one of the joys of working in SharePoint. if something doesn't show up there is usually a good but obscure reason for it, but SharePoint certainly won't tell you outright why it is. All I have to say is that I'm glad he ran into that issue and not me.

Read the article
Implementing Cluster Continuous Replication, Part 2

Cluster continuous replication (CCR) helps to provide a more resilient email system with faster recovery. It was introduced in Microsoft Exchange Server 2007 and uses log shipping and failover. configuring Cluster Continuous Replication on a Windows Server 2008 requires different techniques to Windows Server 2003. Brien Posey explains all.

Read the article
DDDSouthWest 4.0 26th May 2012 - Async 20/20 presentation

- by Liam Westley

As I wasn’t voted in with my nominated sessions I presented a 20/20 talk on the new async functionality coming with the .Net Framework. This was based on the PechaKucha presentation format, where you have only 20 slides with only 20 seconds per slide, and it progresses automatically. It was the first I’d attempted, so thanks to the organisers for allowing me to have a go. Although creating the slide deck was definitely easier than a one hour presentation, it was much more stressful giving the talk by the end of the 6m 40s. I’m not going to upload the slide deck (it won’t make much sense) but I did record the audio and used the excellent Camtasia to create a video of the slide deck with that audio which you can watch over here, https://vimeo.com/42957952

Read the article
New features in SQL Prompt 6.4

- by Tom Crossman

We’re pleased to announce a new beta version of SQL Prompt. We’ve been trying out a few new core technologies, and used them to add features and bug fixes suggested by users on the SQL Prompt forum and suggestions forum. You can download the SQL Prompt 6.4 beta here (zip file). Let us know what you think! New features Execute current statement In a query window, you can now execute the SQL statement under your cursor by pressing Shift + F5. For example, if you have a query containing two statements and your cursor is placed on the second statement: When you press Shift + F5, only the second statement is executed: Insert semicolons You can now use SQL Prompt to automatically insert missing semicolons after each statement in a query. To insert semicolons, go to the SQL Prompt menu and click Insert Semicolons. Alternatively, hold Ctrl and press B then C. BEGIN…END block highlighting When you place your cursor over a BEGIN or END keyword, SQL Prompt now automatically highlights the matching keyword: Rename variables and aliases You can now use SQL Prompt to rename all occurrences of a variable or alias in a query. To rename a variable or alias, place your cursor over an instance of the variable or alias you want to rename and press F2: Improved loading dialog box The database loading dialog box now shows actual progress, and you can cancel loading databases: Single suggestion improvement SQL Prompt no longer suggests keywords if the keyword has been typed and no other suggestions exist. Performance improvement SQL Prompt now has less impact on Management Studio start up time. What do you think? We want to hear your feedback about the beta. If you have any suggestions, or bugs to report, tell us on the SQL Prompt forum or our suggestions forum.

Read the article
A Bit Cloudy

- by Chris Massey

"Systems Administrators, I come in peace. You have nothing to fear from me" - Office 365 Microsoft Business Productivity Online Suite recently absorbed a few other services and has been rebranded as Office 365, which is currently in private Beta and NDA-d up to the eyeballs. As Microsoft's (slightly delayed) answer to Google Apps Premier Edition, it shows a lot of promise; MS has technical expertise, market penetration, and financial capital all going for it. On the other hand, Google...(read more)

Read the article
Database Migration Scripts: Getting from place A to place B

- by Phil Factor

We’ll be looking at a typical database ‘migration’ script which uses an unusual technique to migrate existing ‘de-normalised’ data into a more correct form. So, the book-distribution business that uses the PUBS database has gradually grown organically, and has slipped into ‘de-normalisation’ habits. What’s this? A new column with a list of tags or ‘types’ assigned to books. Because books aren’t really in just one category, someone has ‘cured’ the mismatch between the database and the business requirements. This is fine, but it is now proving difficult for their new website that allows searches by tags. Any request for history book really has to look in the entire list of associated tags rather than the ‘Type’ field that only keeps the primary tag. We have other problems. The TypleList column has duplicates in there which will be affecting the reporting, and there is the danger of mis-spellings getting there. The reporting system can’t be persuaded to do reports based on the tags and the Database developers are complaining about the unCoddly things going on in their database. In your version of PUBS, this extra column doesn’t exist, so we’ve added it and put in 10,000 titles using SQL Data Generator. /* So how do we refactor this database? firstly, we create a table of all the tags. */IF OBJECT_ID('TagName') IS NULL OR OBJECT_ID('TagTitle') IS NULL BEGIN CREATE TABLE TagName (TagName_ID INT IDENTITY(1,1) PRIMARY KEY , Tag VARCHAR(20) NOT NULL UNIQUE) /* ...and we insert into it all the tags from the list (remembering to take out any leading spaces */ INSERT INTO TagName (Tag) SELECT DISTINCT LTRIM(x.y.value('.', 'Varchar(80)')) AS [Tag] FROM (SELECT Title_ID, CONVERT(XML, '<list>' + REPLACE(TypeList, ',', '') + '</list>') AS XMLkeywords FROM dbo.titles)g CROSS APPLY XMLkeywords.nodes('/list/i/text()') AS x ( y ) /* we can then use this table to provide a table that relates tags to articles */ CREATE TABLE TagTitle (TagTitle_ID INT IDENTITY(1, 1), [title_id] [dbo].[tid] NOT NULL REFERENCES titles (Title_ID), TagName_ID INT NOT NULL REFERENCES TagName (Tagname_ID) CONSTRAINT [PK_TagTitle] PRIMARY KEY CLUSTERED ([title_id] ASC, TagName_ID) ON [PRIMARY]) CREATE NONCLUSTERED INDEX idxTagName_ID ON TagTitle (TagName_ID) INCLUDE (TagTitle_ID,title_id) /* ...and it is easy to fill this with the tags for each title ... */ INSERT INTO TagTitle (Title_ID, TagName_ID) SELECT DISTINCT Title_ID, TagName_ID FROM (SELECT Title_ID, CONVERT(XML, '<list>' + REPLACE(TypeList, ',', '') + '</list>') AS XMLkeywords FROM dbo.titles)g CROSS APPLY XMLkeywords.nodes('/list/i/text()') AS x ( y ) INNER JOIN TagName ON TagName.Tag=LTRIM(x.y.value('.', 'Varchar(80)')) END /* That's all there was to it. Now we can select all titles that have the military tag, just to try things out */SELECT Title FROM titles INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID WHERE tagname.tag='Military'/* and see the top ten most popular tags for titles */SELECT Tag, COUNT(*) FROM titles INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID GROUP BY Tag ORDER BY COUNT(*) DESC/* and if you still want your list of tags for each title, then here they are */SELECT title_ID, title, STUFF( (SELECT ','+tagname.tag FROM titles thisTitle INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID WHERE ThisTitle.title_id=titles.title_ID FOR XML PATH(''), TYPE).value('.', 'varchar(max)') ,1,1,'') FROM titles ORDER BY title_ID So we’ve refactored our PUBS database without pain. We’ve even put in a check to prevent it being re-run once the new tables are created. Here is the diagram of the new tag relationship We’ve done both the DDL to create the tables and their associated components, and the DML to put the data in them. I could have also included the script to remove the de-normalised TypeList column, but I’d do a whole lot of tests first before doing that. Yes, I’ve left out the assertion tests too, which should check the edge cases and make sure the result is what you’d expect. One thing I can’t quite figure out is how to deal with an ordered list using this simple XML-based technique. We can ensure that, if we have to produce a list of tags, we can get the primary ‘type’ to be first in the list, but what if the entire order is significant? Thank goodness it isn’t in this case. If it were, we might have to revisit a string-splitter function that returns the ordinal position of each component in the sequence. You’ll see immediately that we can create a synchronisation script for deployment from a comparison tool such as SQL Compare, to change the schema (DDL). On the other hand, no tool could do the DML to stuff the data into the new table, since there is no way that any tool will be able to work out where the data should go. We used some pretty hairy code to deal with a slightly untypical problem. We would have to do this migration by hand, and it has to go into source control as a batch. If most of your database changes are to be deployed by an automated process, then there must be a way of over-riding this part of the data synchronisation process to do this part of the process taking the part of the script that fills the tables, Checking that the tables have not already been filled, and executing it as part of the transaction. Of course, you might prefer the approach I’ve taken with the script of creating the tables in the same batch as the data conversion process, and then using the presence of the tables to prevent the script from being re-run. The problem with scripting a refactoring change to a database is that it has to work both ways. If we install the new system and then have to rollback the changes, several books may have been added, or had their tags changed, in the meantime. Yes, you have to script any rollback! These have to be mercilessly tested, and put in source control just in case of the rollback of a deployment after it has been in place for any length of time. I’ve shown you how to do this with the part of the script .. /* and if you still want your list of tags for each title, then here they are */SELECT title_ID, title, STUFF( (SELECT ','+tagname.tag FROM titles thisTitle INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID WHERE ThisTitle.title_id=titles.title_ID FOR XML PATH(''), TYPE).value('.', 'varchar(max)') ,1,1,'') FROM titles ORDER BY title_ID …which would be turned into an UPDATE … FROM script. UPDATE titles SET typelist= ThisTaglistFROM (SELECT title_ID, title, STUFF( (SELECT ','+tagname.tag FROM titles thisTitle INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID WHERE ThisTitle.title_id=titles.title_ID ORDER BY CASE WHEN tagname.tag=titles.[type] THEN 1 ELSE 0 END DESC FOR XML PATH(''), TYPE).value('.', 'varchar(max)') ,1,1,'') AS ThisTagList FROM titles)fINNER JOIN Titles ON f.title_ID=Titles.title_ID You’ll notice that it isn’t quite a round trip because the tags are in a different order, though we’ve managed to make sure that the primary tag is the first one as originally. So, we’ve improved the database for the poor book distributors using PUBS. It is not a major deal but you’ve got to be prepared to provide a migration script that will go both forwards and backwards. Ideally, database refactoring scripts should be able to go from any version to any other. Schema synchronization scripts can do this pretty easily, but no data synchronisation scripts can deal with serious refactoring jobs without the developers being able to specify how to deal with cases like this.

Read the article
Developing Schema Compare for Oracle (Part 5): Query Snapshots

- by Simon Cooper

If you've emailed us about a bug you've encountered with the EAP or beta versions of Schema Compare for Oracle, we probably asked you to send us a query snapshot of your databases. Here, I explain what a query snapshot is, and how it helps us fix your bug. Problem 1: Debugging users' bug reports When we started the Schema Compare project, we knew we were going to get problems with users' databases - configurations we hadn't considered, features that weren't installed, unicode issues, wierd dependencies... With SQL Compare, users are generally happy to send us a database backup that we can restore using a single RESTORE DATABASE command on our test servers and immediately reproduce the problem. Oracle, on the other hand, would be a lot more tricky. As Oracle generally has a 1-to-1 mapping between instances and databases, any databases users sent would have to be restored to their own instance. Furthermore, the number of steps required to get a properly working database, and the size of most oracle databases, made it infeasible to ask every customer who came across a bug during our beta program to send us their databases. We also knew that there would be lots of issues with data security that would make it hard to get backups. So we needed an easier way to be able to debug customers issues and sort out what strange schema data Oracle was returning. Problem 2: Test execution time Another issue we knew we would have to solve was the execution time of the tests we would produce for the Schema Compare engine. Our initial prototype showed that querying the data dictionary for schema information was going to be slow (at least 15 seconds per database), and this is generally proportional to the size of the database. If you're running thousands of tests on the same databases, each one registering separate schemas, not only would the tests would take hours and hours to run, but the test servers would be hammered senseless. The solution To solve these, we needed to be able to populate the schema of a database without actually connecting to it. Well, the IDataReader interface is the primary way we read data from an Oracle server. The data dictionary queries we use return their data in terms of simple strings and numbers, which we then process and reconstruct into an object model, and the results of these queries are identical for identical schemas. So, we can record the raw results of the queries once, and then replay these results to construct the same object model as many times as required without needing to actually connect to the original database. This is what query snapshots do. They are binary files containing the raw unprocessed data we get back from the oracle server for all the queries we run on the data dictionary to get schema information. The core of the query snapshot generation takes the results of the IDataReader we get from running queries on Oracle, and passes the row data to a BinaryWriter that writes it straight to a file. The query snapshot can then be replayed to create the same object model; when the results of a specific query is needed by the population code, we can simply read the binary data stored in the file on disk and present it through an IDataReader wrapper. This is far faster than querying the server over the network, and allows us to run tests in a reasonable time. They also allow us to easily debug a customers problem; using a simple snapshot generation program, users can generate a query snapshot that could be sent along with a bug report that we can immediately replay on our machines to let us debug the issue, rather than having to obtain database backups and restore databases to test systems. There are also far fewer problems with data security; query snapshots only contain schema information, which is generally less sensitive than table data. Query snapshots implementation However, actually implementing such a feature did have a couple of 'gotchas' to it. My second blog post detailed the development of the dependencies algorithm we use to ensure we get all the dependencies in the database, and that algorithm uses data from both databases to find all the needed objects - what database you're comparing to affects what objects get populated from both databases. We get information on these additional objects using an appropriate WHERE clause on all the population queries. So, in order to accurately replay the results of querying the live database, the query snapshot needs to be a snapshot of a comparison of two databases, not just populating a single database. Furthermore, although the code population queries (eg querying all_tab_cols to get column information) can simply be passed straight from the IDataReader to the BinaryWriter, we need to hook into and run the live dependencies algorithm while we're creating the snapshot to ensure we get the same WHERE clauses, and the same query results, as if we were populating straight from a live system. We also need to store the results of the dependencies queries themselves, as the resulting dependency graph is stored within the OracleDatabase object that is produced, and is later used to help order actions in synchronization scripts. This is significantly helped by the dependencies algorithm being a deterministic algorithm - given the same input, it will always return the same output. Therefore, when we're replaying a query snapshot, and processing dependency information, we simply have to return the results of the queries in the order we got them from the live database, rather than trying to calculate the contents of all_dependencies on the fly. Query snapshots are a significant feature in Schema Compare that really helps us to debug problems with the tool, as well as making our testers happier. Although not really user-visible, they are very useful to the development team to help us fix bugs in the product much faster than we otherwise would be able to.

Read the article
Showplan Operator of the week - Assert

As part of his mission to explain the Query Optimiser in practical terms, Fabiano attempts the feat of describing, one week at a time, all the major Showplan Operators used by SQL Server's Query Optimiser to build the Query Plan. He starts with Assert

Read the article
Going by the eBook

- by Tony Davis

The book and magazine publishing world is rapidly going digital, and the industry is faced with making drastic changes to their ways of doing business. The sudden take-up of digital readers by the book-buying public has surprised even the most technological-savvy of the industry. Printed books just aren't selling like they did. In contrast, eBooks are doing well. The ePub file format is the standard around which all publishers are converging. ePub is a standard for formatting book content, so that it can be reflowed for various devices, with their widely differing screen-sizes, and can be read offline. If you unzip an ePub file, you'll find familiar formats such as XML, XHTML and CSS. This is both a blessing and a curse. Whilst it is good to be able to use familiar technologies that have been developed to a level of considerable sophistication, it doesn't get us all the way to producing a viable publication. XHTML is a page-description language, not a book-description language, as we soon found out during our initial experiments, when trying to specify headers, footers, indexes and chaptering. As a result, it is difficult to predict how any particular eBook application will decide to render a book. There isn't even a consensus as to how the cover image is specified. All of this is awkward for the publisher. Each book must be created and revised in a form from which can be generated a whole range of 'printed media', from print books, to Mobi for kindles, ePub for most Tablets and SmartPhones, HTML for excerpted chapters on websites, and a plethora of other formats for other eBook readers, each with its own idiosyncrasies. In theory, if we can get our content into a clean, semantic XML form, such as DOCBOOKS, we can, from there, after every revision, perform a series of relatively simple XSLT transformations to output anything from a HTML article, to an ePub file for reading on an iPad, to an ICML file (an XML-based file format supported by the InDesign tool), ready for print publication. As always, however, the task looks bigger the closer you get to the detail. On the way to the utopian world of an XML-based book format that encompasses all the diverse requirements of the different publication media, ePub looks like a reasonable format to adopt. Its forthcoming support for HTML 5 and CSS 3, with ePub 3.0, means that features, such as widow-and-orphan controls, multi-column flow and multi-media graphics can be incorporated into eBooks. This starts to make it possible to build an "app-like" experience into the eBook and to free publishers to think of putting context before container; to think of what content is required, be it graphical, textual or audio, from the point of view of the user, rather than what's possible in a given, traditional book "Container". In the meantime, there is a gap between what publishers require and what current technology can provide and, of course building this app-like experience is far from plain sailing. Real portability between devices is still a big challenge, and achieving the sort of wizardry seen in the likes of Theodore Grey's "Elements" eBook will require some serious device-specific programming skills. Cheers, Tony.

Read the article
Welcome to the Red Gate BI Tools Team blog!

- by BI Tools Team

Welcome to the first ever post on the brand new Red Gate Business Intelligence Tools Team blog! About the team Nick Sutherland (product manager): After many years as a software developer and project manager, Nick took an MBA and turned to product marketing. SSAS Compare is his second lean startup product (the first being SQL Connect). Follow him on Twitter. David Pond (developer): Before he joined Red Gate in 2011, David made monitoring systems for Goodyear. Follow him on Twitter. Jonathan Watts (tester): Jonathan became a tester after finishing his media degree and joining Xerox. He joined Red Gate in 2004. Follow him on Twitter. James Duffy (technical author): After a spell as a writer in the video game industry, James lived briefly in Tokyo before returning to the UK to start at Red Gate. What we're working on We launched a beta of our first tool, SSAS Compare, last month. It works like SQL Compare but for SSAS cubes, letting you deploy just the changes you want. It's completely free (for now), so check it out. We're still working on it, and we're eager to hear what you think. We hope SSAS Compare will be the first of several tools Red Gate develops for BI professionals, so keep an eye out for more from us in the future. Why we need you This is your chance to help influence the course of SSAS Compare and our future BI tools. If you're a business intelligence specialist, we want to hear about the problems you face so we can build tools that solve them. What do you want to see? Tell us! We'll be posting more about SSAS Compare, business intelligence and our journey into BI in the coming days and weeks. Stay tuned!

Read the article
Documentation and Test Assertions in Databases

- by Phil Factor

When I first worked with Sybase/SQL Server, we thought our databases were impressively large but they were, by today’s standards, pathetically small. We had one script to build the whole database. Every script I ever read was richly annotated; it was more like reading a document. Every table had a comment block, and every line would be commented too. At the end of each routine (e.g. procedure) was a quick integration test, or series of test assertions, to check that nothing in the build was broken. We simply ran the build script, stored in the Version Control System, and it pulled everything together in a logical sequence that not only created the database objects but pulled in the static data. This worked fine at the scale we had. The advantage was that one could, by reading the source code, reach a rapid understanding of how the database worked and how one could interface with it. The problem was that it was a system that meant that only one developer at the time could work on the database. It was very easy for a developer to execute accidentally the entire build script rather than the selected section on which he or she was working, thereby cleansing the database of everyone else’s work-in-progress and data. It soon became the fashion to work at the object level, so that programmers could check out individual views, tables, functions, constraints and rules and work on them independently. It was then that I noticed the trend to generate the source for the VCS retrospectively from the development server. Tables were worst affected. You can, of course, add or delete a table’s columns and constraints retrospectively, which means that the existing source no longer represents the current object. If, after your development work, you generate the source from the live table, then you get no block or line comments, and the source script is sprinkled with silly square-brackets and other confetti, thereby rendering it visually indigestible. Routines, too, were affected. In our system, every routine had a directly attached string of unit-tests. A retro-generated routine has no unit-tests or test assertions. Yes, one can still commit our test code to the VCS but it’s a separate module and teams end up running the whole suite of tests for every individual change, rather than just the tests for that routine, which doesn’t scale for database testing. With Extended properties, one can get the best of both worlds, and even use them to put blame, praise or annotations into your VCS. It requires a lot of work, though, particularly the script to generate the table. The problem is that there are no conventional names beyond ‘MS_Description’ for the special use of extended properties. This makes it difficult to do splendid things such ensuring the integrity of the build by running a suite of tests that are actually stored in extended properties within the database and therefore the VCS. We have lost the readability of database source code over the years, and largely jettisoned the use of test assertions as part of the database build. This is not unexpected in view of the increasing complexity of the structure of databases and number of programmers working on them. There must, surely, be a way of getting them back, but I sometimes wonder if I’m one of very few who miss them.

Read the article
Metrics - A little knowledge can be a dangerous thing (or 'Why you're not clever enough to interpret metrics data')

- by Jason Crease

At RedGate Software, I work on a .NET obfuscator called SmartAssembly. Various features of it use a database to store various things (exception reports, name-mappings, etc.) The user is given the option of using either a SQL-Server database (which requires them to have Microsoft SQL Server), or a Microsoft Access MDB file (which requires nothing). MDB is the default option, but power-users soon switch to using a SQL Server database because it offers better performance and data-sharing. In the fashionable spirit of optimization and metrics, an obvious product-management question is 'Which is the most popular? SQL Server or MDB?' We've collected data about this fact, using our 'Feature-Usage-Reporting' technology (available as part of SmartAssembly) and more recently our 'Application Metrics' technology: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 28 19.0 8115 8115 MDB 114 77.6 1449 1449 (As a disclaimer, please note than SmartAssembly has far more than 132 users . This data is just a selection of one build) So, it would appear that SQL-Server is used by fewer users, but more often. Great. But here's why these numbers are useless to me: Only the original developers understand the data What does a single 'usage' of 'MDB' mean? Does this happen once per run? Once per option change? On clicking the 'Obfuscate Now' button? When running the command-line version or just from the UI version? Each question could skew the data 10-fold either way, and the answers only known by the developer that instrumented the application in the first place. In other words, only the original developer can interpret the data - product-managers cannot interpret the data unaided. Most of the data is from uninterested users About half of people who download and run a free-trial from the internet quit it almost immediately. Only a small fraction use it sufficiently to make informed choices. Since the MDB option is the default one, we don't know how many of those 114 were people CHOOSING to use the MDB, or how many were JUST HAPPENING to use this MDB default for their 20-second trial. This is a problem we see across all our metrics: Are people are using X because it's the default or are they using X because they want to use X? We need to segment the data further - asking what percentage of each percentage meet our criteria for an 'established user' or 'informed user'. You end up spending hours writing sophisticated and dubious SQL queries to segment the data further. Not fun. You can't find out why they used this feature Metrics can answer the when and what, but not the why. Why did people use feature X? If you're anything like me, you often click on random buttons in unfamiliar applications just to explore the feature-set. If we listened uncritically to metrics at RedGate, we would eliminate the most-important and more-complex features which people actually buy the software for, leaving just big buttons on the main page and the About-Box. "Ah, that's interesting!" rather than "Ah, that's actionable!" People do love data. Did you know you eat 1201 chickens in a lifetime? But just 4 cows? Interesting, but useless. Often metrics give you a nice number: '5.8% of users have 3 or more monitors' . But unless the statistic is both SUPRISING and ACTIONABLE, it's useless. Most metrics are collected, reviewed with lots of cooing. and then forgotten. Unless a piece-of-data could change things, it's useless collecting it. People get obsessed with significance levels The first things that lots of people do with this data is do a t-test to get a significance level ("Hey! We know with 99.64% confidence that people prefer SQL Server to MDBs!") Believe me: other causes of error/misinterpretation in your data are FAR more significant than your t-test could ever comprehend. Confirmation bias prevents objectivity If the data appears to match our instinct, we feel satisfied and move on. If it doesn't, we suspect the data and dig deeper, plummeting down a rabbit-hole of segmentation and filtering until we give-up and move-on. Data is only useful if it can change our preconceptions. Do you trust this dodgy data more than your own understanding, knowledge and intelligence? I don't. There's always multiple plausible ways to interpret/action any data Let's say we segment the above data, and get this data: Post-trial users (i.e. those using a paid version after the 14-day free-trial is over): Parameter Number of users % of total users Number of sessions Number of usages SQL Server 13 9.0 1115 1115 MDB 5 4.2 449 449 Trial users: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 15 10.0 7000 7000 MDB 114 77.6 1000 1000 How do you interpret this data? It's one of: Mostly SQL Server users buy our software. People who can't afford SQL Server tend to be unable to afford or unwilling to buy our software. Therefore, ditch MDB-support. Our MDB support is so poor and buggy that our massive MDB user-base doesn't buy it. Therefore, spend loads of money improving it, and think about ditching SQL-Server support. People 'graduate' naturally from MDB to SQL Server as they use the software more. Things are fine the way they are. We're marketing the tool wrong. The large number of MDB users represent uninformed downloaders. Tell marketing to aggressively target SQL Server users. To choose an interpretation you need to segment again. And again. And again, and again. Opting-out is correlated with feature-usage Metrics tends to be opt-in. This skews the data even further. Between 5% and 30% of people choose to opt-in to metrics (often called 'customer improvement program' or something like that). Casual trial-users who are uninterested in your product or company are less likely to opt-in. This group is probably also likely to be MDB users. How much does this skew your data by? Who knows? It's not all doom and gloom. There are some things metrics can answer well. Environment facts. How many people have 3 monitors? Have Windows 7? Have .NET 4 installed? Have Japanese Windows? Minor optimizations. Is the text-box big enough for average user-input? Performance data. How long does our app take to start? How many databases does the average user have on their server? As you can see, questions about who-the-user-is rather than what-the-user-does are easier to answer and action. Conclusion Use SmartAssembly. If not for the metrics (called 'Feature-Usage-Reporting'), then at least for the obfuscation/error-reporting. Data raises more questions than it answers. Questions about environment are the easiest to answer.

Read the article
SQLIO Writes

- by Grant Fritchey

SQLIO is a fantastic utility for testing the abilities of the disks in your system. It has a very unfortunate name though, since it's not really a SQL Server testing utility at all. It really is a disk utility. They ought to call it DiskIO because they'd get more people using I think. Anyway, branding is not the point of this blog post. Writes are the point of this blog post. SQLIO works by slamming your disk. It performs as mean reads as it can or it performs as many writes as it can depending on how you've configured your tests. There are much smarter people than me who will get into all the various types of tests you should run. I'd suggest reading a bit of what Jonathan Kehayias (blog|twitter) has to say or wade into Denny Cherry's (blog|twitter) work. They're going to do a better job than I can describing all the benefits and mechanisms around using this excellent piece of software. My concerns are very focused. I needed to set up a series of tests to see how well our product SQL Storage Compress worked. I wanted to know the effects it would have on a system, the disk for sure, but also memory and CPU. How to stress the system? SQLIO of course. But when I set it up and ran it, following the documentation that comes with it, I was seeing better than 99% compression on the files. Don't get me wrong. Our product is magnificent, wonderful, all things great and beautiful, gets you coffee in the morning and is made mostly from bacon. But 99% compression. No, it's not that good. So what's up? Well, it's the configuration. The default mechanism is to load up a file, something large that will overwhelm your disk cache. You're instructed to load the file with a character 0x0. I never got a computer science degree. I went to film school. Because of this, I didn't memorize ASCII tables so when I saw this, I thought it was zero's or something. Nope. It's NULL. That's right, you're making a very large file, but you're filling it with NULL values. That's actually ok when all you're testing is the disk sub-system. But, when you want to test a compression and decompression, that can be an issue. I got around this fairly quickly. Instead of generating a file filled with NULL values, I just copied a database file for my tests. And to test it with SQL Storage Compress, I used a database file that had already been run through compression (about 40% compression on that file if you're interested). Now the reads were taken care of. I am seeing very realistic performance from decompressing the information for reads through SQLIO. But what about writes? Well, the issue is, what does SQLIO write? I don't have access to the code. But I do have access to the results. I did two different tests, just to be sure of what I was seeing. First test, use the .DAT file as described in the documentation. I opened the .DAT file after I was done with SQLIO, using WordPad. Guess what? It's a giant file full of air. SQLIO writes NULL values. What does that do to compression? I did the test again on a copy of an uncompressed database file. Then I ran the original and the SQLIO modified copy through ZIP to see what happened. I got better than 99% compression out of the SQLIO modified file (original file of 624,896kb went to 275,871kb compressed, after SQLIO it went to 608kb compressed). So, what does SQLIO write? It writes air. If you're trying to test it with compression or maybe some other type of file storage mechanism like dedupe, you need to know this because your tests really won't be valid. Should I find some other mechanism for testing? Yeah, if all I'm interested in is establishing performance to my own satisfaction, yes. But, I want to be able to compare my results with other people's results and we all need to be using the same tool in order for that to happen. SQLIO is the common mechanism that most people I know use to establish disk performance behavior. It'd be better if we could get SQLIO to do writes in some other fashion. Oh, and before I go, I get to brag a bit. Measuring IOPS, SQL Storage Compress outperforms my disk alone by about 30%.

Read the article
Why enumerator structs are a really bad idea (redux)

- by Simon Cooper

My previous blog post went into some detail as to why calling MoveNext on a BCL generic collection enumerator didn't quite do what you thought it would. This post covers the Reset method. To recap, here's the simple wrapper around a linked list enumerator struct from my previous post (minus the readonly on the enumerator variable): sealed class EnumeratorWrapper : IEnumerator<int> { private LinkedList<int>.Enumerator m_Enumerator; public EnumeratorWrapper(LinkedList<int> linkedList) { m_Enumerator = linkedList.GetEnumerator(); } public int Current { get { return m_Enumerator.Current; } } object System.Collections.IEnumerator.Current { get { return Current; } } public bool MoveNext() { return m_Enumerator.MoveNext(); } public void Reset() { ((System.Collections.IEnumerator)m_Enumerator).Reset(); } public void Dispose() { m_Enumerator.Dispose(); } } If you have a look at the Reset method, you'll notice I'm having to cast to IEnumerator to be able to call Reset on m_Enumerator. This is because the implementation of LinkedList<int>.Enumerator.Reset, and indeed of all the other Reset methods on the BCL generic collection enumerators, is an explicit interface implementation. However, IEnumerator is a reference type. LinkedList<int>.Enumerator is a value type. That means, in order to call the reset method at all, the enumerator has to be boxed. And the IL confirms this: .method public hidebysig newslot virtual final instance void Reset() cil managed { .maxstack 8 L_0000: nop L_0001: ldarg.0 L_0002: ldfld valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator<int32> EnumeratorWrapper::m_Enumerator L_0007: box [System]System.Collections.Generic.LinkedList`1/Enumerator<int32> L_000c: callvirt instance void [mscorlib]System.Collections.IEnumerator::Reset() L_0011: nop L_0012: ret } On line 0007, we're doing a box operation, which copies the enumerator to a reference object on the heap, then on line 000c calling Reset on this boxed object. So m_Enumerator in the wrapper class is not modified by the call the Reset. And this is the only way to call the Reset method on this variable (without using reflection). Therefore, the only way that the collection enumerator struct can be used safely is to store them as a boxed IEnumerator<T>, and not use them as value types at all.

Read the article
It’s official – Red Gate is a great place to work!

- by red@work

At a glittering award ceremony last week, we found out that we’re officially the 14th best small company to work for in the whole of the UK! This is no mean feat, considering that about 1,000 companies enter the Sunday Times Top 100 best companies awards each year. Most of these are in the small companies category too. It's the fourth year in a row for us to be in the Top 100 list and we're tickled pink because the results are based on employee opinion. We’re particularly proud to be the best small company in Cambridge (in the whole of East Anglia, in fact) and the best small software development company in the entire UK. So how does it all work? Well, 90% of us took the time to answer over 70 questions on categories such as management, benefits, wellbeing, leadership, giving something back and what we think of Red Gate as a whole. It makes you think about every part of day to day working life and how you feel about it. Do you slightly or strongly agree or disagree that your manager motivates your to do your best every day, or that you have confidence in Red Gate's leaders, or that you’re not spending too much time working? It's great to see that we had one of the best scores in the country for the question "Do you think your company takes advantage of you?" We got particularly high scores for management, wellbeing and for giving something back too. A few of us got dressed up and headed to London for the awards; very excited about where we’d place but slightly nervous about having to get up on stage. There was a last minute hic up with a bow tie but the Managing Editor of the Sunday Times kindly stepped in to offer his assistance just before we had our official photo taken. We were nominated for two Special Recognition Awards. Despite not bringing them home this year, we're very proud to be nominated as there are only three nominations in each category. First we were up for the Training and Development award. Best Companies loved that we get together at lunchtimes to teach each other photography, cookery and French, as well as our book clubs and techie talks. And of course they liked our opportunities to go on training courses and to jet off to international conferences. Our other nomination was for the Wellbeing award. Best Companies loved our free food (and let’s face it, so do we). Porridge or bacon sandwiches for breakfast, a three course hot dinner, and free fruit and cereals all day long. If all that has an affect on the waistline then there are plenty of sporty activities for us all to get involved in, such as yoga, running or squash. Or if that’s not your thing then a relaxing massage helps us all to unwind every few months or so. The awards were hosted by news presenter Kate Silverton. She gave us a special mention during the ceremony for having great customer engagement as well as employee engagement, after we told her about Rodney Landrum (a Friend of Red Gate) tattooing our logo on his arm. We showed off our customised dinner jacket (thanks to Dom from Usability) with a flashing Red Gate logo on the back and she seemed suitability impressed. Back in the office the next day, we popped open the champagne and raised a glass to our success. Neil, our joint CEO, talked about how pleased he was with the award because it's based on the opinions of the people that count – us. You can read more about the Sunday Times awards here. By the way, we're still growing and are still hiring. If you’d like to keep up with our latest vacancies then why not follow us on Twitter at twitter.com/redgatecareers. Right now we're busy hiring in development, test, sales, product management, web development, and project management. Here's a link to our current job opportunities page – we'd love to hear from great people who are looking for a great place to work! After all, we're only great because of the people who work here. Post by: Alice Chapman

Read the article
SharePoint: Numeric/Integer Site Column (Field) Types

- by CharlesLee

What field type should you use when creating number based site columns as part of a SharePoint feature? Windows SharePoint Services 3.0 provides you with an extensible and flexible method of developing and deploying Site Columns and Content Types (both of which are required for most SharePoint projects requiring list or library based data storage) via the feature framework (more on this in my next full article.) However there is an interesting behaviour when working with a column or field which is required to hold a number, which I thought I would blog about today. When creating Site Columns in the browser you get a nice rich UI in order to choose the properties of this field: However when you are recreating this as a feature defined in CAML (Collaborative Application Mark-up Language), which is a type of XML (more on this in my article) then you do not get such a rich experience. You would need to add something like this to the element manifest defined in your feature: <Field SourceID="http://schemas.microsoft.com/sharepoint/3.0" ID="{C272E927-3748-48db-8FC0-6C7B72A6D220}" Group="My Site Columns" Name="MyNumber" DisplayName="My Number" Type="Numeric" Commas="FALSE" Decimals="0" Required="FALSE" ReadOnly="FALSE" Sealed="FALSE" Hidden="FALSE" /> OK, its not as nice as the browser UI but I can deal with this. Hang on. Commas="FALSE" and yet for my number 1234 I get 1,234. That is not what I wanted or expected. What gives? The answer lies in the difference between a type of "Numeric" which is an implementation of the SPFieldNumber class and "Integer" which does not correspond to a given SPField class but rather represents a positive or negative integer. The numeric type does not respect the settings of Commas or NegativeFormat (which defines how to display negative numbers.) So we can set the Type to Integer and we are good to go. Yes? Sadly no! You will notice at this point that if you deploy your site column into SharePoint something has gone wrong. Your site column is not listed in the Site Column Gallery. The deployment must have failed then? But no, a quick look at the site columns via the API reveals that the column is there. What new evil is this? Unfortunately the base type for integer fields has this lovely attribute set on it: UserCreatable = FALSE So WSS 3.0 accordingly hides your field in the gallery as you cannot create fields of this type. However! You can use them in content types just like any other field (except not in the browser UI), and if you add them to the content type as part of your feature then they will show up in the UI as a field on that content type. Most of the time you are not going to be too concerned that your site columns are not listed in the gallery as you will know that they are there and that they are still useable. So not as bad as you thought after all. Just a little quirky. But that is SharePoint for you.

Read the article
The Great PST Migration

Having recently been on the front lines of a massive PST import operation, Sean Duffy offers advice and points out pitfalls. More than anything, he wishes he had a simple tool with which to banish PST hell, and finishes with some hard-won guidelines.

Read the article
Turnkey with LightSwitch

- by Laila

Microsoft has long wanted to find a replacement for Microsoft Access. The best attempt yet, which is due out in, or before, September is Visual Studio LightSwitch, with which it is said to be as 'easy as flipping a switch' to use Silverlight to create simple form-driven business applications. It is easy to get confused by the various initiatives from Microsoft. No, this isn't WebMatrix. There is no 'Razor', for this isn't meant for cute little ecommerce sites, but is designed to build simple database-applications of the card-box type. It is more clearly a .NET-based solution to the problem that every business seems to suffer from; the plethora of Access-based, and Excel-based 'private' and departmental database-applications. These are a nightmare for any IT department since they are often 'stealth' applications built by the business in the teeth of opposition from the IT Department zealots. As they are undocumented, it is scarily easy to bring a whole department into disarray by decommissioning a PC tucked under a desk somewhere. With LightSwitch, it is easy to re-write such applications in a standard, maintainable, way, using a SQL Server database, deployed somewhere reasonably safe such as Azure. Even Sharepoint or Windows Communication Foundation can be used as data sources. Oracle's ApEx has taken off remarkably well, and has shaken the perception that, for the business user, Oracle must remain a mystic force accessible only to the priests and acolytes. Microsoft, by comparison had only Access, which was first released in 1992, the year of the Madonna conical bustier. It looks just as dated. Microsoft badly needed an entirely new solution to the same business requirement that led to Access's and Foxpro's long-time popularity, but which had the same allure as ApEx. LightSwitch is sound in its ideas, and comfortingly conventional in its architecture. By giving an easy access to SQL Server databases, and providing a 'thumb and blanket' migration path to Access-heads, LightSwitch seems likely to offer a simple way of pulling more Microsoft users into the .NET community. If Microsoft puts its weight behind it, then it will give some glimmer of hope to the many Silverlight developers that Microsoft is capable of seeing through its .NET revolution.

Read the article
Defining .NET Components with Namespaces

A .NET software component is a compiled set of classes that provide a programmable interface that is used by consumer applications for a service. As a component is no more than a logical grouping of classes, what then is the best way to define the boundaries of a component within the .NET framework? How should the classes inter-operate? Patrick Smacchia, the lead developer of NDepend, discusses the issues and comes up with a solution.

Read the article
Data Model Dissonance

- by Tony Davis

So often at the start of the development of database applications, there is a premature rush to the keyboard. Unless, before we get there, we’ve mapped out and agreed the three data models, the Conceptual, the Logical and the Physical, then the inevitable refactoring will dog development work. It pays to get the data models sorted out up-front, however ‘agile’ you profess to be. The hardest model to get right, the most misunderstood, and the one most neglected by the various modeling tools, is the conceptual data model, and yet it is critical to all that follows. The conceptual model distils what the business understands about itself, and the way it operates. It represents the business rules that govern the required data, its constraints and its properties. The conceptual model uses the terminology of the business and defines the most important entities and their inter-relationships. Don’t assume that the organization’s understanding of these business rules is consistent or accurate. Too often, one department has a subtly different understanding of what an entity means and what it stores, from another. If our conceptual data model fails to resolve such inconsistencies, it will reduce data quality. If we don’t collect and measure the raw data in a consistent way across the whole business, how can we hope to perform meaningful aggregation? The conceptual data model has more to do with business than technology, and as such, developers often regard it as a worthy but rather arcane ceremony like saluting the flag or only eating fish on Friday. However, the consequences of getting it wrong have a direct and painful impact on many aspects of the project. If you adopt a silo-based (a.k.a. Domain driven) approach to development), you are still likely to suffer by starting with an incomplete knowledge of the domain. Even when you have surmounted these problems so that the data entities accurately reflect the business domain that the application represents, there are likely to be dire consequences from abandoning the goal of a shared, enterprise-wide understanding of the business. In reading this, you may recall experiences of the consequence of getting the conceptual data model wrong. I believe that Phil Factor, for example, witnessed the abandonment of a multi-million dollar banking project due to an inadequate conceptual analysis of how the bank defined a ‘customer’. We’d love to hear of any examples you know of development projects poleaxed by errors in the conceptual data model. Cheers, Tony

Read the article
What Counts For a DBA: Fitness

- by Louis Davidson

If you know me, you can probably guess that physical exercise is not really my thing. There was a time in my past when it a larger part of my life, but even then never in the same sort of passionate way as a number of our SQL friends. For me, I find that mental exercise satisfies what I believe to be the same inner need that drives people to run farther than I like to drive on most Saturday mornings, and it is certainly just as addictive. Mental fitness shares many common traits with physical fitness, especially the need to attain it through repetitive training. I only wish that mental training burned off a bacon cheeseburger in the same manner as does jogging around a dewy park on Saturday morning. In physical training, there are at least two goals, the first of which is to be physically able to do a task. The second is to train the brain to perform the task without thinking too hard about it. No matter how long it has been since you last rode a bike, you will be almost certainly be able to hop on and start riding without thinking about the process of pedaling or balancing. If you’ve never ridden a bike, you could be a physics professor /Olympic athlete and still crash the first few times you try, even though you are as strong as an ox and your knowledge of the physics of bicycle riding makes the concept child’s play. For programming tasks, the process is very similar. As a DBA, you will come to know intuitively how to backup, optimize, and secure database systems. As a data programmer, you will work to instinctively use the clauses of Transact-SQL DML so that, when you need to group data three ways (and not four), you will know to use the GROUP BY clause with GROUPING SETS without resorting to a search engine. You have the skill. Making it naturally then requires repetition and experience is the primary requirement, not just simply learning about a topic. The hardest part of being really good at something is this difference between knowledge and skill. I have recently taken several informative training classes with Kimball University on data warehousing and ETL. Now I have a lot more knowledge about designing data warehouses than before. I have also done a good bit of data warehouse designing of late and have started to improve to some level of proficiency with the theory. Yet, for all of this head knowledge, it is still a struggle to take what I have learned and apply it to the designs I am working on. Data warehousing is still a task that is not yet deeply ingrained in my brain muscle memory. On the other hand, relational database design is something that no matter how much or how little I may get to do it, I am comfortable doing it. I have done it as a profession now for well over a decade, I teach classes on it, and I also have done (and continue to do) a lot of mental training beyond the work day. Sometimes the training is just basic education, some reading blogs and attending sessions at PASS events. My best training comes from spending time working on other people’s design issues in forums (though not nearly as much as I would like to lately). Working through other people’s problems is a great way to exercise your brain on problems with which you’re not immediately familiar. The final bit of exercise I find useful for cultivating mental fitness for a data professional is also probably the nerdiest thing that I will ever suggest you do. Akin to running in place, the idea is to work through designs in your head. I have designed more than one database system that would revolutionize grocery store operations, sales at my local Target store, the ordering process at Amazon, and ways to improve Disney World operations to get me through a line faster (some of which they are starting to implement without any of my help.) Never are the designs truly fleshed out, but enough to work through structures and processes. On “paper”, I have designed database systems to catalog things as trivial as my Lego creations, rental car companies and my audio and video collections. Once I get the database designed mentally, sometimes I will create the database, add some data (often using Red-Gate’s Data Generator), and write a few queries to see if a concept was realistic, but I will rarely fully flesh out the database since I have no desire to do any user interface programming anymore. The mental training allows me to keep in practice for when the time comes to do the work I love the most for real…even if I have been spending most of my work time lately building data warehouses. If you are really strong of mind and body, perhaps you can mix a mental run with a physical run; though don’t run off of a cliff while contemplating how you might design a database to catalog the trees on a mountain…that would be contradictory to the purpose of both types of exercise.

Read the article

Search Results

Search found 3797 results on 152 pages for 'talk'.

Page 34/152 | < Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41 | Next Page >

- by Damon Armstrong

- by Damon

- by Simon Cooper

- by Tom Lewin

- by Damon

- by Liam Westley

- by Tom Crossman

- by Chris Massey

- by Phil Factor

- by Simon Cooper

- by Tony Davis

- by BI Tools Team

- by Phil Factor

- by Jason Crease

- by Grant Fritchey

- by Simon Cooper

- by red@work

- by CharlesLee

- by Laila

- by Tony Davis

- by Louis Davidson

< Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41 | Next Page >