Search Results

Search found 47615 results on 1905 pages for 'make it useful keep it simple'.

Page 161/1905 | < Previous Page | 157 158 159 160 161 162 163 164 165 166 167 168 | Next Page >

Hype and LINQ

- by Tony Davis

"Tired of querying in antiquated SQL?" I blinked in astonishment when I saw this headline on the LinqPad site. Warming to its theme, the site suggests that what we need is to "kiss goodbye to SSMS", and instead use LINQ, a modern query language! Elsewhere, there is an article entitled "Why LINQ beats SQL". The designers of LINQ, along with many DBAs, would, I'm sure, cringe with embarrassment at the suggestion that LINQ and SQL are, in any sense, competitive ways of doing the same thing. In fact what LINQ really is, at last, is an efficient, declarative language for C# and VB programmers to access or manipulate data in objects, local data stores, ORMs, web services, data repositories, and, yes, even relational databases. The fact is that LINQ is essentially declarative programming in a .NET language, and so in many ways encourages developers into a "SQL-like" mindset, even though they are not directly writing SQL. In place of imperative logic and loops, it uses various expressions, operators and declarative logic to build up an "expression tree" describing only what data is required, not the operations to be performed to get it. This expression tree is then parsed by the language compiler, and the result, when used against a relational database, is a SQL string that, while perhaps not always perfect, is often correctly parameterized and certainly no less "optimal" than what is achieved when a developer applies blunt, imperative logic to the SQL language. From a developer standpoint, it is a mistake to consider LINQ simply as a substitute means of querying SQL Server. The strength of LINQ is that that can be used to access any data source, for which a LINQ provider exists. Microsoft supplies built-in providers to access not just SQL Server, but also XML documents, .NET objects, ADO.NET datasets, and Entity Framework elements. LINQ-to-Objects is particularly interesting in that it allows a declarative means to access and manipulate arrays, collections and so on. Furthermore, as Michael Sorens points out in his excellent article on LINQ, there a whole host of third-party LINQ providers, that offers a simple way to get at data in Excel, Google, Flickr and much more, without having to learn a new interface or language. Of course, the need to be generic enough to deal with a range of data sources, from something as mundane as a text file to as esoteric as a relational database, means that LINQ is a compromise and so has inherent limitations. However, it is a powerful and beautifully compact language and one that, at least in its "query syntax" guise, is accessible to developers and DBAs alike. Perhaps there is still hope that LINQ can fulfill Phil Factor's lobster-induced fantasy of a language that will allow us to "treat all data objects, whether Word files, Excel files, XML, relational databases, text files, HTML files, registry files, LDAPs, Outlook and so on, in the same logical way, as linked databases, and extract the metadata, create the entities and relationships in the same way, and use the same SQL syntax to interrogate, create, read, write and update them." Cheers, Tony.

Read the article
Why enumerator structs are a really bad idea

- by Simon Cooper

If you've ever poked around the .NET class libraries in Reflector, I'm sure you would have noticed that the generic collection classes all have implementations of their IEnumerator as a struct rather than a class. As you will see, this design decision has some rather unfortunate side effects... As is generally known in the .NET world, mutable structs are a Very Bad Idea; and there are several other blogs around explaining this (Eric Lippert's blog post explains the problem quite well). In the BCL, the generic collection enumerators are all mutable structs, as they need to keep track of where they are in the collection. This bit me quite hard when I was coding a wrapper around a LinkedList<int>.Enumerator. It boils down to this code: sealed class EnumeratorWrapper : IEnumerator<int> { private readonly LinkedList<int>.Enumerator m_Enumerator; public EnumeratorWrapper(LinkedList<int> linkedList) { m_Enumerator = linkedList.GetEnumerator(); } public int Current { get { return m_Enumerator.Current; } } object System.Collections.IEnumerator.Current { get { return Current; } } public bool MoveNext() { return m_Enumerator.MoveNext(); } public void Reset() { ((System.Collections.IEnumerator)m_Enumerator).Reset(); } public void Dispose() { m_Enumerator.Dispose(); } } The key line here is the MoveNext method. When I initially coded this, I thought that the call to m_Enumerator.MoveNext() would alter the enumerator state in the m_Enumerator class variable and so the enumeration would proceed in an orderly fashion through the collection. However, when I ran this code it went into an infinite loop - the m_Enumerator.MoveNext() call wasn't actually changing the state in the m_Enumerator variable at all, and my code was looping forever on the first collection element. It was only after disassembling that method that I found out what was going on The MoveNext method above results in the following IL: .method public hidebysig newslot virtual final instance bool MoveNext() cil managed { .maxstack 1 .locals init ( [0] bool CS$1$0000, [1] valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator CS$0$0001) L_0000: nop L_0001: ldarg.0 L_0002: ldfld valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator EnumeratorWrapper::m_Enumerator L_0007: stloc.1 L_0008: ldloca.s CS$0$0001 L_000a: call instance bool [System]System.Collections.Generic.LinkedList`1/Enumerator::MoveNext() L_000f: stloc.0 L_0010: br.s L_0012 L_0012: ldloc.0 L_0013: ret } Here, the important line is 0002 - m_Enumerator is accessed using the ldfld operator, which does the following: Finds the value of a field in the object whose reference is currently on the evaluation stack. So, what the MoveNext method is doing is the following: public bool MoveNext() { LinkedList<int>.Enumerator CS$0$0001 = this.m_Enumerator; bool CS$1$0000 = CS$0$0001.MoveNext(); return CS$1$0000; } The enumerator instance being modified by the call to MoveNext is the one stored in the CS$0$0001 variable on the stack, and not the one in the EnumeratorWrapper class instance. Hence why the state of m_Enumerator wasn't getting updated. Hmm, ok. Well, why is it doing this? If you have a read of Eric Lippert's blog post about this issue, you'll notice he quotes a few sections of the C# spec. In particular, 7.5.4: ...if the field is readonly and the reference occurs outside an instance constructor of the class in which the field is declared, then the result is a value, namely the value of the field I in the object referenced by E. And my m_Enumerator field is readonly! Indeed, if I remove the readonly from the class variable then the problem goes away, and the code works as expected. The IL confirms this: .method public hidebysig newslot virtual final instance bool MoveNext() cil managed { .maxstack 1 .locals init ( [0] bool CS$1$0000) L_0000: nop L_0001: ldarg.0 L_0002: ldflda valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator EnumeratorWrapper::m_Enumerator L_0007: call instance bool [System]System.Collections.Generic.LinkedList`1/Enumerator::MoveNext() L_000c: stloc.0 L_000d: br.s L_000f L_000f: ldloc.0 L_0010: ret } Notice on line 0002, instead of the ldfld we had before, we've got a ldflda, which does this: Finds the address of a field in the object whose reference is currently on the evaluation stack. Instead of loading the value, we're loading the address of the m_Enumerator field. So now the call to MoveNext modifies the enumerator stored in the class rather than on the stack, and everything works as expected. Previously, I had thought enumerator structs were an odd but interesting feature of the BCL that I had used in the past to do linked list slices. However, effects like this only underline how dangerous mutable structs are, and I'm at a loss to explain why the enumerators were implemented as structs in the first place. (interestingly, the SortedList<TKey, TValue> enumerator is a struct but is private, which makes it even more odd - the only way it can be accessed is as a boxed IEnumerator!). I would love to hear people's theories as to why the enumerators are implemented in such a fashion. And bonus points if you can explain why LinkedList<int>.Enumerator.Reset is an explicit implementation but Dispose is implicit... Note to self: never ever ever code a mutable struct.

Read the article
Agile Testing Days 2012 – Day 3 – Agile or agile?

- by Chris George

Another early start for my last Lean Coffee of the conference, and again it was not wasted. We had some really interesting discussions around how to determine what test automation is useful, if agile is not faster, why do it? and a rather existential discussion on whether unicorns exist! First keynote of the day was entitled “Fast Feedback Teams” by Ola Ellnestam. Again this relates nicely to the releasing faster talk on day 2, and something that we are looking at and some teams are actively trying. Introducing the notion of feedback, Ola describes a game he wrote for his eldest child. It was a simple game where every time he clicked a button, it displayed “You’ve Won!”. He then changed it to be a Win-Lose-Win-Lose pattern and watched the feedback from his son who then twigged the pattern and got his younger brother to play, alternating turns… genius! (must do that with my children). The idea behind this was that you need that feedback loop to learn and progress. If you are not getting the feedback you need to close that loop. An interesting point Ola made was to solve problems BEFORE writing software. It may be that you don’t have to write anything at all, perhaps it’s a communication/training issue? Perhaps the problem can be solved another way. Writing software, although it’s the business we are in, is expensive, and this should be taken into account. He again mentions frequent releases, and how they should be made as soon as stuff is ready to be released, don’t leave stuff on the shelf cause it’s not earning you anything, money or data. I totally agree with this and it’s something that we will be aiming for moving forwards. “Exceptions, Assumptions and Ambiguity: Finding the truth behind the story” by David Evans started off very promising by making references to ‘Grim up North’ referring to the north of England. Not sure it was appreciated by most of the audience, but it made me laugh! David explained how there are always risks associated with exceptions, giving the example of a one-way road near where he lives, with an exception sign giving rights to coaches to go the wrong way. Therefore you could merrily swing around the corner of the one way road straight into a coach! David showed the danger in making assumptions with lyrical quotes from Lola by The Kinks “I’m glad I’m a man, and so is Lola” and with a picture of a toilet flush that needed instructions to operate the full and half flush. With this particular flush, you pulled the handle all the way down to half flush, and half way down to full flush! hmmm, a bit of a crappy user experience methinks! Then through a clever use of a passage from the Jabberwocky, David then went onto show how mis-translation/ambiguity is the can completely distort the original meaning of something, and this is a real enemy of software development. This was all helping to demonstrate that the term Story is often heavily overloaded in the Agile world, and should really be stripped back to what it is really for, stating a business problem, and offering a technical solution. Therefore a story could be worded as “In order to {make some improvement}, we will { do something}”. The first ‘in order to’ statement is stakeholder neutral, and states the problem through requesting an improvement to the software/process etc. The second part of the story is the verb, the doing bit. So to achieve the ‘improvement’ which is not currently true, we will do something to make this true in the future. My PM is very interested in this, and he’s observed some of the problems of overloading stories so I’m hoping between us we can use some of David’s suggestions to help clarify our stories better. The second keynote of the day (and our last) proved to be the most entertaining and exhausting of the conference for me. “The ongoing evolution of testing in agile development” by Scott Barber. I’ve never had the pleasure of seeing Scott before… OMG I would love to have even half of the energy he has! What struck me during this presentation was Scott’s explanation of how testing has become the role/job that it is (largely) today, and how this has led to the need for ‘methodologies’ to make dev and test work! The argument that we should be trying to converge the roles again is a very valid one, and one that a couple of the teams at work are actively doing with great results. Making developers as responsible for quality as testers is something that has been lost over the years, but something that we are now striving to achieve. The idea that we (testers) should be testing experts/specialists, not testing ‘union members’, supports this idea so the entire team works on all aspects of a feature/product, with the ‘specialists’ taking the lead and advising/coaching the others. This leads to better propagation of information around the team, a greater holistic understanding of the project and it allows the team to continue functioning if some of it’s members are off sick, for example. Feeling somewhat drained from Scott’s keynote (but at the same time excited that alot of the points he raised supported actions we are taking at work), I headed into my last presentation for Agile Testing Days 2012 before having to make my way to Tegel to catch the flight home. “Thinking and working agile in an unbending world” with Pete Walen was a talk I was not going to miss! Having spoken to Pete several times during the past few days, I was looking forward to hearing what he was going to say, and I was not disappointed. Pete started off by trying to separate the definitions of ‘Agile’ as in the methodology, and ‘agile’ as in the adjective by pronouncing them the ‘english’ and ‘american’ ways. So Agile pronounced (Ajyle) and agile pronounced (ajul). There was much confusion around what the hell he was talking about, although I thought it was quite clear. Agile – Software development methodology agile – Marked by ready ability to move with quick easy grace; Having a quick resourceful and adaptable character. Anyway, that aside (although it provided a few laughs during the presentation), the point was that many teams that claim to be ‘Agile’ but are not, in fact, ‘agile’ by nature. Implementing ‘Agile’ methodologies that are so prescriptive actually goes against the very nature of Agile development where a team should anticipate, adapt and explore. Pete made a valid point that very few companies intentionally put up roadblocks to impede work, so if work is being blocked/delayed, why? This is where being agile as a team pays off because the team can inspect what’s going on, explore options and adapt their processes. It is through experimentation (and that means trying and failing as well as trying and succeeding) that a team will improve and grow leading to focussing on what really needs to be done to achieve X. So, that was it, the last talk of our conference. I was gutted that we had to miss the closing keynote from Matt Heusser, as Matt was another person I had spoken too a few times during the conference, but the flight would not wait, and just as well we left when we did because the traffic was a nightmare! My Takeaway Triple from Day 3: Release often and release small – don’t leave stuff on the shelf Keep the meaning of the word ‘agile’ in mind when working in ‘Agile Look at testing as more of a skill than a role

Read the article
Encrypting your SQL Server Passwords in Powershell

- by laerte

A couple of months ago, a friend of mine who is now bewitched by the seemingly supernatural abilities of Powershell (+1 for the team) asked me what, initially, appeared to be a trivial question: "Laerte, I do not have the luxury of being able to work with my SQL servers through Windows Authentication, and I need a way to automatically pass my username and password. How would you suggest I do this?" Given that I knew he, like me, was using the SQLPSX modules (an open source project created by Chad Miller; a fantastic library of reusable functions and PowerShell scripts), I merrily replied, "Simply pass the Username and Password in SQLPSX functions". He rather pointed responded: "My friend, I might as well pass: Username-'Me'-password 'NowEverybodyKnowsMyPassword'" As I do have the pleasure of working with Windows Authentication, I had not really thought this situation though yet (and thank goodness I only revealed my temporary ignorance to a friend, and the embarrassment was minimized). After discussing this puzzle with Chad Miller, he showed me some code for saving passwords on SQL Server Tables, which he had demo'd in his Powershell ETL session at Tampa SQL Saturday (and you can download the scripts from here). The solution seemed to be pretty much ready to go, so I showed it to my Authentication-impoverished friend, only to discover that we were only half-way there: "That's almost what I want, but the details need to be stored in my local txt file, together with the names of the servers that I'll actually use the Powershell scripts on. Something like: Server1,UserName,Password Server2,UserName,Password" I thought about it for just a few milliseconds (Ha! Of course I'm not telling you how long it actually took me, I have to do my own marketing, after all) and the solution was finally ready. First , we have to download Library-StringCripto (with many thanks to Steven Hystad), which is composed of two functions: One for encryption and other for decryption, both of which are used to manage the password. If you want to know more about the library, you can see more details in the help functions. Next, we have to create a txt file with your encrypted passwords:$ServerName = "Server1" $UserName = "Login1" $Password = "Senha1" $PasswordToEncrypt = "YourPassword" $UserNameEncrypt = Write-EncryptedString -inputstring $UserName -Password $PasswordToEncrypt $PasswordEncrypt = Write-EncryptedString -inputstring $Password -Password $PasswordToEncrypt "$($Servername),$($UserNameEncrypt),$($PasswordEncrypt)" | Out-File c:\temp\ServersSecurePassword.txt -Append $ServerName = "Server2" $UserName = "Login2" $Password = "senha2" $PasswordToEncrypt = "YourPassword" $UserNameEncrypt = Write-EncryptedString -inputstring $UserName -Password $PasswordToEncrypt $PasswordEncrypt = Write-EncryptedString -inputstring $Password -Password $PasswordToEncrypt "$($Servername),$($UserNameEncrypt),$($PasswordEncrypt)" | Out-File c:\temp\ ServersSecurePassword.txt -Append .And in the c:\temp\ServersSecurePassword.txt file which we've just created, you will find your Username and Password, all neatly encrypted. Let's take a look at what the txt looks like: .and in case you're wondering, Server names, Usernames and Passwords are all separated by commas. Decryption is actually much more simple:Read-EncryptedString -InputString $EncryptString -password "YourPassword" (Just remember that the Password you're trying to decrypt must be exactly the same as the encrypted phrase.) Finally, just to show you how smooth this solution is, let's say I want to use the Invoke-DBMaint function from SQLPSX to perform a checkdb on a system database: it's just a case of split, decrypt and be happy!Get-Content c:\temp\ServerSecurePassword.txt | foreach { [array] $Split = ($_).split(",") Invoke-DBMaint -server $($Split[0]) -UserName (Read-EncryptedString -InputString $Split[1] -password "YourPassword" ) -Password (Read-EncryptedString -InputString $Split[2] -password "YourPassword" ) -Databases "SYSTEM" -Action "CHECK_DB" -ReportOn c:\Temp } This is why I love Powershell.

Read the article
Database Owner Conundrum

- by Johnm

Have you ever restored a database from a production environment on Server A into a development environment on Server B and had some items, such as Service Broker, mysteriously cease functioning? You might want to consider reviewing the database owner property of the database. The Scenario Recently, I was developing some messaging functionality that utilized the Service Broker feature of SQL Server in a development environment. Within the instance of the development environment resided two databases: One was a restored version of a production database that we will call "RestoreDB". The second database was a brand new database that has yet to exist in the production environment that we will call "DevDB". The goal is to setup a communication path between RestoreDB and DevDB that will later be implemented into the production database. After implementing all of the Service Broker objects that are required to communicate within a database as well as between two databases on the same instance I found myself a bit confounded. My testing was showing that the communication was successful when it was occurring internally within DevDB; but the communication between RestoreDB and DevDB did not appear to be working. Profiler to the rescue After carefully reviewing my code for any misspellings, missing commas or any other minor items that might be a syntactical cause of failure, I decided to launch Profiler to aid in the troubleshooting. After simulating the cross database messaging, I noticed the following error appearing in Profiler: An exception occurred while enqueueing a message in the target queue. Error: 33009, State: 2. The database owner SID recorded in the master database differs from the database owner SID recorded in database '[Database Name Here]'. You should correct this situation by resetting the owner of database '[Database Name Here]' using the ALTER AUTHORIZATION statement. Now, this error message is a helpful one. Not only does it identify the issue in plain language, it also provides a potential solution. An execution of the following query that utilizes the catalog view sys.transmission_queue revealed the same error message for each communication attempt: SELECT * FROM sys.transmission_queue; Seeing the situation as a learning opportunity I dove a bit deeper. Reviewing the database properties The owner of a specific database can be easily viewed by right-clicking the database in SQL Server Management Studio and selecting the "properties" option. The owner is listed on the "General" page of the properties screen. In my scenario, the database in the production server was created by Frank the DBA; therefore his server login appeared as the owner: "ServerName\Frank". While this is interesting information, it certainly doesn't tell me much in regard to the SID (security identifier) and its existence, or lack thereof, in the master database as the error suggested. I pulled together the following query to gather more interesting information: SELECT a.name , a.owner_sid , b.sid , b.name , b.type_desc FROM master.sys.databases a LEFT OUTER JOIN master.sys.server_principals b ON a.owner_sid = b.sid WHERE a.name not in ('master','tempdb','model','msdb'); This query also helped identify how many other user databases in the instance were experiencing the same issue. In this scenario, I saw that there were no matching SIDs in server_principals to the owner SID for my database. What login should be used as the database owner instead of Frank's? The system stored procedure sp_helplogins will provide a list of the valid logins that can be used. Here is an example of its use, revealing all available logins: EXEC sp_helplogins; Fixing a hole The error message stated that the recommended solution was to execute the ALTER AUTHORIZATION statement. The full statement for this scenario would appear as follows: ALTER AUTHORIZATION ON DATABASE:: [Database Name Here] TO [Login Name]; Another option is to execute the following statement using the sp_changedbowner system stored procedure; but please keep in mind that this stored procedure has been deprecated and will likely disappear in future versions of SQL Server: EXEC dbo.sp_changedbowner @loginname = [Login Name]; .And They Lived Happily Ever After Upon changing the database owner to an existing login and simulating the inner and cross database messaging the errors have ceased. More importantly, all messages sent through this feature now successfully complete their journey. I have added the ownership change to my restoration script for the development environment.

Read the article
Subterranean IL: Volatile

- by Simon Cooper

This time, we'll be having a look at the volatile. prefix instruction, and one of the differences between volatile in IL and C#. The volatile. prefix volatile is a tricky one, as there's varying levels of documentation on it. From what I can see, it has two effects: It prevents caching of the load or store value; rather than reading or writing to a cached version of the memory location (say, the processor register or cache), it forces the value to be loaded or stored at the 'actual' memory location, so it is then immediately visible to other threads. It forces a memory barrier at the prefixed instruction. This ensures instructions don't get re-ordered around the volatile instruction. This is slightly more complicated than it first seems, and only seems to matter on certain architectures. For more details, Joe Duffy has a blog post going into the details. For this post, I'll be concentrating on the first aspect of volatile. Caching field accesses To demonstrate this, I created a simple multithreaded IL program. It boils down to the following code: .class public Holder { .field public static class Holder holder .field public bool stop .method public static specialname void .cctor() { newobj instance void Holder::.ctor() stsfld class Holder Holder::holder ret }}.method private static void Main() { .entrypoint // Thread t = new Thread(new ThreadStart(DoWork)) // t.Start() // Thread.Sleep(2000) // Console.WriteLine("Stopping thread...") ldsfld class Holder Holder::holder ldc.i4.1 stfld bool Holder::stop call instance void [mscorlib]System.Threading.Thread::Join() ret}.method private static void DoWork() { ldsfld class Holder Holder::holder // while (!Holder.holder.stop) {} DoWork: dup ldfld bool Holder::stop brfalse DoWork pop ret} If you compile and run this code, you'll find that the call to Thread.Join() never returns - the DoWork spinlock is reading a cached version of Holder.stop, which is never being updated with the new value set by the Main method. Adding volatile to the ldfld fixes this: dupvolatile.ldfld bool Holder::stopbrfalse DoWork The volatile ldfld forces the field access to read direct from heap memory, which is then updated by the main thread, rather than using a cached copy. volatile in C# This highlights one of the differences between IL and C#. In IL, volatile only applies to the prefixed instruction, whereas in C#, volatile is specified on a field to indicate that all accesses to that field should be volatile (interestingly, there's no mention of the 'no caching' aspect of volatile in the C# spec; it only focuses on the memory barrier aspect). Furthermore, this information needs to be stored within the assembly somehow, as such a field might be accessed directly from outside the assembly, but there's no concept of a 'volatile field' in IL! How this information is stored with the field will be the subject of my next post.

Read the article
.NET Security Part 4

- by Simon Cooper

Finally, in this series, I am going to cover some of the security issues that can trip you up when using sandboxed appdomains. DISCLAIMER: I am not a security expert, and this is by no means an exhaustive list. If you actually are writing security-critical code, then get a proper security audit of your code by a professional. The examples below are just illustrations of the sort of things that can go wrong. 1. AppDomainSetup.ApplicationBase The most obvious one is the issue covered in the MSDN documentation on creating a sandbox, in step 3 – the sandboxed appdomain has the same ApplicationBase as the controlling appdomain. So let’s explore what happens when they are the same, and an exception is thrown. In the sandboxed assembly, Sandboxed.dll (IPlugin is an interface in a partially-trusted assembly, with a single MethodToDoThings on it): public class UntrustedPlugin : MarshalByRefObject, IPlugin { // implements IPlugin.MethodToDoThings() public void MethodToDoThings() { throw new EvilException(); } } [Serializable] internal class EvilException : Exception { public override string ToString() { // show we have read access to C:\Windows // read the first 5 directories Console.WriteLine("Pwned! Mwuahahah!"); foreach (var d in Directory.EnumerateDirectories(@"C:\Windows").Take(5)) { Console.WriteLine(d.FullName); } return base.ToString(); } } And in the controlling assembly: // what can possibly go wrong? AppDomainSetup appDomainSetup = new AppDomainSetup { ApplicationBase = AppDomain.CurrentDomain.SetupInformation.ApplicationBase } // only grant permissions to execute // and to read the application base, nothing else PermissionSet restrictedPerms = new PermissionSet(PermissionState.None); restrictedPerms.AddPermission( new SecurityPermission(SecurityPermissionFlag.Execution)); restrictedPerms.AddPermission( new FileIOPermission(FileIOPermissionAccess.Read, appDomainSetup.ApplicationBase); restrictedPerms.AddPermission( new FileIOPermission(FileIOPermissionAccess.pathDiscovery, appDomainSetup.ApplicationBase); // create the sandbox AppDomain sandbox = AppDomain.CreateDomain("Sandbox", null, appDomainSetup, restrictedPerms); // execute UntrustedPlugin in the sandbox // don't crash the application if the sandbox throws an exception IPlugin o = (IPlugin)sandbox.CreateInstanceFromAndUnwrap("Sandboxed.dll", "UntrustedPlugin"); try { o.MethodToDoThings() } catch (Exception e) { Console.WriteLine(e.ToString()); } And the result? Oops. We’ve allowed a class that should be sandboxed to execute code with fully-trusted permissions! How did this happen? Well, the key is the exact meaning of the ApplicationBase property: The application base directory is where the assembly manager begins probing for assemblies. When EvilException is thrown, it propagates from the sandboxed appdomain into the controlling assembly’s appdomain (as it’s marked as Serializable). When the exception is deserialized, the CLR finds and loads the sandboxed dll into the fully-trusted appdomain. Since the controlling appdomain’s ApplicationBase directory contains the sandboxed assembly, the CLR finds and loads the assembly into a full-trust appdomain, and the evil code is executed. So the problem isn’t exactly that the sandboxed appdomain’s ApplicationBase is the same as the controlling appdomain’s, it’s that the sandboxed dll was in such a place that the controlling appdomain could find it as part of the standard assembly resolution mechanism. The sandbox then forced the assembly to load in the controlling appdomain by throwing a serializable exception that propagated outside the sandbox. The easiest fix for this is to keep the sandbox ApplicationBase well away from the ApplicationBase of the controlling appdomain, and don’t allow the sandbox permissions to access the controlling appdomain’s ApplicationBase directory. If you do this, then the sandboxed assembly can’t be accidentally loaded into the fully-trusted appdomain, and the code can’t be executed. If the plugin does try to induce the controlling appdomain to load an assembly it shouldn’t, a SerializationException will be thrown when it tries to load the assembly to deserialize the exception, and no damage will be done. 2. Loading the sandboxed dll into the application appdomain As an extension of the previous point, you shouldn’t directly reference types or methods in the sandboxed dll from your application code. That loads the assembly into the fully-trusted appdomain, and from there code in the assembly could be executed. Instead, pull out methods you want the sandboxed dll to have into an interface or class in a partially-trusted assembly you control, and execute methods via that instead (similar to the example above with the IPlugin interface). If you need to have a look at the assembly before executing it in the sandbox, either examine the assembly using reflection from within the sandbox, or load the assembly into the Reflection-only context in the application’s appdomain. The code in assemblies in the reflection-only context can’t be executed, it can only be reflected upon, thus protecting your appdomain from malicious code. 3. Incorrectly asserting permissions You should only assert permissions when you are absolutely sure they’re safe. For example, this method allows a caller read-access to any file they call this method with, including your documents, any network shares, the C:\Windows directory, etc: [SecuritySafeCritical] public static string GetFileText(string filePath) { new FileIOPermission(FileIOPermissionAccess.Read, filePath).Assert(); return File.ReadAllText(filePath); } Be careful when asserting permissions, and ensure you’re not providing a loophole sandboxed dlls can use to gain access to things they shouldn’t be able to. Conclusion Hopefully, that’s given you an idea of some of the ways it’s possible to get past the .NET security system. As I said before, this post is not exhaustive, and you certainly shouldn’t base any security-critical applications on the contents of this blog post. What this series should help with is understanding the possibilities of the security system, and what all the security attributes and classes mean and what they are used for, if you were to use the security system in the future.

Read the article
Subterranean IL: The ThreadLocal type

- by Simon Cooper

I came across ThreadLocal<T> while I was researching ConcurrentBag. To look at it, it doesn't really make much sense. What's all those extra Cn classes doing in there? Why is there a GenericHolder<T,U,V,W> class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem. Thread statics Declaring that a variable is thread static, that is, values assigned and read from the field is specific to the thread doing the reading, is quite easy in .NET: [ThreadStatic] private static string s_ThreadStaticField; ThreadStaticAttribute is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity. TheadStaticAttribute provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitary number of thread static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution - thread local data slots. This is a lesser-known function of Thread that has existed since .NET 1.1: LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1"); string value = "foo"; Thread.SetData(threadSlot, value); string gettedValue = (string)Thread.GetData(threadSlot); Each instance of LocalStoreDataSlot mediates access to a single slot, and each slot acts like a separate thread-static field. As you can see, using thread data slots is quite cumbersome. You need to keep track of LocalDataStoreSlot objects, it's not obvious how instances of LocalDataStoreSlot correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in Thread itself, using various arrays, lists, and locks for synchronization. ThreadLocal<T> is far simpler and easier to use. ThreadLocal ThreadLocal provides an abstraction around thread-static fields that allows it to be used just like any other class; it can be used as a replacement for a thread-static field, it can be used in a List<ThreadLocal<T>>, you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as static, and so shared between all instances of the declaring type. There's something else going on here. The values stored in instances of ThreadLocal<T> are stored in instantiations of the GenericHolder<T,U,V,W> class, which contains a single ThreadStatic field (s_value) to store the actual value. This class is then instantiated with various combinations of the Cn types for generic arguments. In .NET, each separate instantiation of a generic type has its own static state. For example, GenericHolder<int,C0,C1,C2> has a completely separate s_value field to GenericHolder<int,C1,C14,C1>. This feature is (ab)used by ThreadLocal to emulate instance thread-static fields. Every time an instance of ThreadLocal is constructed, it is assigned a unique number from the static s_currentTypeId field using Interlocked.Increment, in the FindNextTypeIndex method. The hexadecimal representation of that number then defines the specific Cn types that instantiates the GenericHolder class. That instantiation is therefore 'owned' by that instance of ThreadLocal. This gives each instance of ThreadLocal its own ThreadStatic field through a specific unique instantiation of the GenericHolder class. Although GenericHolder has four type variables, the first one is always instantiated to the type stored in the ThreadLocal<T>. This gives three free type variables, each of which can be instantiated to one of 16 types (C0 to C15). This puts an upper limit of 4096 (163) on the number of ThreadLocal<T> instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of ThreadLocal<string>, and separately a maximum of 4096 instances of ThreadLocal<object>, etc. However, there is an upper limit of 16384 enforced on the total number of ThreadLocal instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of GenericHolder<T,U,V,W>, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of ThreadLocal instances created is tracked by the ThreadLocalGlobalCounter class. So what happens when either limit is reached? Firstly, to try and stop this limit being reached, it recycles GenericHolder type indexes of ThreadLocal instances that get disposed using the s_availableIndices concurrent stack. This allows GenericHolder instantiations of disposed ThreadLocal instances to be re-used. But if there aren't any available instantiations, then ThreadLocal falls back on a standard thread local slot using TLSHolder. This makes it very important to dispose of your ThreadLocal instances if you'll be using lots of them, so the type instantiations can be recycled. The previous way of creating arbitary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, ThreadLocal can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of ThreadLocal its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.

Read the article
Metrics - A little knowledge can be a dangerous thing (or 'Why you're not clever enough to interpret metrics data')

- by Jason Crease

At RedGate Software, I work on a .NET obfuscator called SmartAssembly. Various features of it use a database to store various things (exception reports, name-mappings, etc.) The user is given the option of using either a SQL-Server database (which requires them to have Microsoft SQL Server), or a Microsoft Access MDB file (which requires nothing). MDB is the default option, but power-users soon switch to using a SQL Server database because it offers better performance and data-sharing. In the fashionable spirit of optimization and metrics, an obvious product-management question is 'Which is the most popular? SQL Server or MDB?' We've collected data about this fact, using our 'Feature-Usage-Reporting' technology (available as part of SmartAssembly) and more recently our 'Application Metrics' technology: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 28 19.0 8115 8115 MDB 114 77.6 1449 1449 (As a disclaimer, please note than SmartAssembly has far more than 132 users . This data is just a selection of one build) So, it would appear that SQL-Server is used by fewer users, but more often. Great. But here's why these numbers are useless to me: Only the original developers understand the data What does a single 'usage' of 'MDB' mean? Does this happen once per run? Once per option change? On clicking the 'Obfuscate Now' button? When running the command-line version or just from the UI version? Each question could skew the data 10-fold either way, and the answers only known by the developer that instrumented the application in the first place. In other words, only the original developer can interpret the data - product-managers cannot interpret the data unaided. Most of the data is from uninterested users About half of people who download and run a free-trial from the internet quit it almost immediately. Only a small fraction use it sufficiently to make informed choices. Since the MDB option is the default one, we don't know how many of those 114 were people CHOOSING to use the MDB, or how many were JUST HAPPENING to use this MDB default for their 20-second trial. This is a problem we see across all our metrics: Are people are using X because it's the default or are they using X because they want to use X? We need to segment the data further - asking what percentage of each percentage meet our criteria for an 'established user' or 'informed user'. You end up spending hours writing sophisticated and dubious SQL queries to segment the data further. Not fun. You can't find out why they used this feature Metrics can answer the when and what, but not the why. Why did people use feature X? If you're anything like me, you often click on random buttons in unfamiliar applications just to explore the feature-set. If we listened uncritically to metrics at RedGate, we would eliminate the most-important and more-complex features which people actually buy the software for, leaving just big buttons on the main page and the About-Box. "Ah, that's interesting!" rather than "Ah, that's actionable!" People do love data. Did you know you eat 1201 chickens in a lifetime? But just 4 cows? Interesting, but useless. Often metrics give you a nice number: '5.8% of users have 3 or more monitors' . But unless the statistic is both SUPRISING and ACTIONABLE, it's useless. Most metrics are collected, reviewed with lots of cooing. and then forgotten. Unless a piece-of-data could change things, it's useless collecting it. People get obsessed with significance levels The first things that lots of people do with this data is do a t-test to get a significance level ("Hey! We know with 99.64% confidence that people prefer SQL Server to MDBs!") Believe me: other causes of error/misinterpretation in your data are FAR more significant than your t-test could ever comprehend. Confirmation bias prevents objectivity If the data appears to match our instinct, we feel satisfied and move on. If it doesn't, we suspect the data and dig deeper, plummeting down a rabbit-hole of segmentation and filtering until we give-up and move-on. Data is only useful if it can change our preconceptions. Do you trust this dodgy data more than your own understanding, knowledge and intelligence? I don't. There's always multiple plausible ways to interpret/action any data Let's say we segment the above data, and get this data: Post-trial users (i.e. those using a paid version after the 14-day free-trial is over): Parameter Number of users % of total users Number of sessions Number of usages SQL Server 13 9.0 1115 1115 MDB 5 4.2 449 449 Trial users: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 15 10.0 7000 7000 MDB 114 77.6 1000 1000 How do you interpret this data? It's one of: Mostly SQL Server users buy our software. People who can't afford SQL Server tend to be unable to afford or unwilling to buy our software. Therefore, ditch MDB-support. Our MDB support is so poor and buggy that our massive MDB user-base doesn't buy it. Therefore, spend loads of money improving it, and think about ditching SQL-Server support. People 'graduate' naturally from MDB to SQL Server as they use the software more. Things are fine the way they are. We're marketing the tool wrong. The large number of MDB users represent uninformed downloaders. Tell marketing to aggressively target SQL Server users. To choose an interpretation you need to segment again. And again. And again, and again. Opting-out is correlated with feature-usage Metrics tends to be opt-in. This skews the data even further. Between 5% and 30% of people choose to opt-in to metrics (often called 'customer improvement program' or something like that). Casual trial-users who are uninterested in your product or company are less likely to opt-in. This group is probably also likely to be MDB users. How much does this skew your data by? Who knows? It's not all doom and gloom. There are some things metrics can answer well. Environment facts. How many people have 3 monitors? Have Windows 7? Have .NET 4 installed? Have Japanese Windows? Minor optimizations. Is the text-box big enough for average user-input? Performance data. How long does our app take to start? How many databases does the average user have on their server? As you can see, questions about who-the-user-is rather than what-the-user-does are easier to answer and action. Conclusion Use SmartAssembly. If not for the metrics (called 'Feature-Usage-Reporting'), then at least for the obfuscation/error-reporting. Data raises more questions than it answers. Questions about environment are the easiest to answer.

Read the article
Why enumerator structs are a really bad idea (redux)

- by Simon Cooper

My previous blog post went into some detail as to why calling MoveNext on a BCL generic collection enumerator didn't quite do what you thought it would. This post covers the Reset method. To recap, here's the simple wrapper around a linked list enumerator struct from my previous post (minus the readonly on the enumerator variable): sealed class EnumeratorWrapper : IEnumerator<int> { private LinkedList<int>.Enumerator m_Enumerator; public EnumeratorWrapper(LinkedList<int> linkedList) { m_Enumerator = linkedList.GetEnumerator(); } public int Current { get { return m_Enumerator.Current; } } object System.Collections.IEnumerator.Current { get { return Current; } } public bool MoveNext() { return m_Enumerator.MoveNext(); } public void Reset() { ((System.Collections.IEnumerator)m_Enumerator).Reset(); } public void Dispose() { m_Enumerator.Dispose(); } } If you have a look at the Reset method, you'll notice I'm having to cast to IEnumerator to be able to call Reset on m_Enumerator. This is because the implementation of LinkedList<int>.Enumerator.Reset, and indeed of all the other Reset methods on the BCL generic collection enumerators, is an explicit interface implementation. However, IEnumerator is a reference type. LinkedList<int>.Enumerator is a value type. That means, in order to call the reset method at all, the enumerator has to be boxed. And the IL confirms this: .method public hidebysig newslot virtual final instance void Reset() cil managed { .maxstack 8 L_0000: nop L_0001: ldarg.0 L_0002: ldfld valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator<int32> EnumeratorWrapper::m_Enumerator L_0007: box [System]System.Collections.Generic.LinkedList`1/Enumerator<int32> L_000c: callvirt instance void [mscorlib]System.Collections.IEnumerator::Reset() L_0011: nop L_0012: ret } On line 0007, we're doing a box operation, which copies the enumerator to a reference object on the heap, then on line 000c calling Reset on this boxed object. So m_Enumerator in the wrapper class is not modified by the call the Reset. And this is the only way to call the Reset method on this variable (without using reflection). Therefore, the only way that the collection enumerator struct can be used safely is to store them as a boxed IEnumerator<T>, and not use them as value types at all.

Read the article
Hadoop, NOSQL, and the Relational Model

- by Phil Factor

(Guest Editorial for the IT Pro/SysAdmin Newsletter)Whereas Relational Databases fit the world of commerce like a glove, it is useless to pretend that they are a perfect fit for all human endeavours. Although, with SQL Server, we’ve made great strides with indexing text, in processing spatial data and processing markup, there is still a problem in dealing efficiently with large volumes of ephemeral semi-structured data. Key-value stores such as Cassandra, Project Voldemort, and Riak are of great value for ephemeral data, and seem of equal value as a data-feed that provides aggregations to an RDBMS. However, the Document databases such as MongoDB and CouchDB are ideal for semi-structured data for which no fixed schema exists; analytics and logging are obvious examples. NoSQL products, such as MongoDB, tackle the semi-structured data problem with panache. MongoDB is designed with a simple document-oriented data model that scales horizontally across multiple servers. It doesn’t impose a schema, and relies on the application to enforce the data structure. This is another take on the old ‘EAV’ problem (where you don’t know in advance all the attributes of a particular entity) It uses a clever replica set design that allows automatic failover, and uses journaling for data durability. It allows indexing and ad-hoc querying. However, for SQL Server users, the obvious choice for handling semi-structured data is Apache Hadoop. There will soon be an ODBC Driver for Apache Hive .and an Add-in for Excel. Additionally, there are now two Hadoop-based connectors for SQL Server; the Apache Hadoop connector for SQL Server 2008 R2, and the SQL Server Parallel Data Warehouse (PDW) connector. We can connect to Hadoop process the semi-structured data and then store it in SQL Server. For one steeped in the culture of Relational SQL Databases, I might be expected to throw up my hands in the air in a gesture of contempt for a technology that was, judging by the overblown journalism on the subject, about to make my own profession as archaic as the Saggar makers bottom knocker (a potter’s assistant who helped the saggar maker to make the bottom of the saggar by placing clay in a metal hoop and bashing it). However, on the contrary, I find that I'm delighted with the advances made by the NoSQL databases in the past few years. Having the flow of ideas from the NoSQL providers will knock any trace of complacency out of the providers of Relational Databases and inspire them into back-fitting some features, such as horizontal scaling, with sharding and automatic failover into SQL-based RDBMSs. It will do the breed a power of good to benefit from all this lateral thinking.

Read the article
The Birth of SSAS Compare

- by Red Gate Software BI Tools Team

Noemi Moreno, Red Gate Business Intelligence Specialist Software vendors – even Microsoft – tend to forget about the needs of business intelligence developers. We are a rare and rather invisible species. For example, BIDS remained in VS 2008 until SQL Server 2012. It took until this release before we got something as simple as an “undo” function. Before I joined Red Gate as a BI specialist, I worked on SQL Development. I’ll never forget the time I discovered Red Gate’s SQL Compare tool and how it reduced the task of preparing a database release from a couple of days to ten minutes. When I moved to SSAS, MDX and cubes, I became frustrated with the deployment process because I couldn’t find a tool that made Cube releases as easy as they are with SQL Compare. This became my quest. I pitched the idea to a few people in Red Gate’s regular Down Tools Week, when everyone puts down their day-to-day tasks and works on their own projects. My task was to reason with a roomful of cynical developers, hardened to the blandishments of project managers, for help to develop a tool that would compare two different SSAS databases and create the script to process only the objects that needed processing, thereby reducing release time to only a few minutes. I walked to the podium and gave them the full story of the distressed BI specialists, doomed to spend tedious hours preparing deployment scripts. A few developers recovered from their torpor to cast a languid eye at my presentation. It wasn’t enough. In a sudden impulse, I blurted out a promise to perform a flamenco dance for just the team if the tool was able to successfully compare two SSAS databases and generate a script by the end of the week. I was lucky enough that some of them believed me and jumped in: David Pond (Dev), Matt Burton (Dev), Tilman Bregler (Dev), Shobana Sekar (Test), Ruchija Raj (Test), Nick Sutherland (Product Manager) and Irma Tanovic (BI). They didn’t know that Irma and I would be away on a conference in Amsterdam and would leave them without our support. But to my surprise, they had a working tool by the time we came back – basic, and with a few bugs, but a working tool nonetheless! Seeing it compare a very basic SSAS database, detect the changes and generate the scripts was amazing! Something that normally takes half a day was done in under a minute. Since then, a few months have passed and a BI Tools team has been created at Red Gate to work full time on BI tools for BI developers, starting with SSAS Compare. How cool is that? So download the free beta and give us your feedback. And the flamenco? I still need to deliver that. Tilman reminds me every day! I need to get the full flamenco costume.

Read the article
Some Problems Can't Be Outsourced

- by mikef

More and more companies are becoming attracted to the idea of Infrastructure as a Service (or IaaS). It would seem that you can outsource the provisioning and management of your services, encompassing everything from Email, through to your servers, workstations and software, all the way down to your LAN and internet services. This type of outsourcing can be a very attractive option for companies who have tight budgets who are short of technical skills or don't have the means to provide long-term IT support. Essentially, they can outsource your services at low short-term costs that are knowable and controllable, are quickly and easily scalable, and generate a minimum of hassle for your internal staff. If you want to get a sophisticated IT infrastructure set up in a hurry without the usual high buy-in costs, or the task of finding and hiring the right specialists. It would seem the way to go, particularly when their salesmen are hypnotizing you with oleaginous phrases such as "we are closely aligned with our client organization's core business requirements, providing agile services". It sounds too good to be true, and so it is. Whereas the costs will have initially been calculated on the annual renewal fees and service fees for ongoing support, there are other charges too which aren't so obvious. It can end up costing far more than the conventional solution once you take into account the extra costs, the fees for customization and upgrades. The Total Cost of Ownership (TCO) only becomes apparent when it is too late to extract the company easily from the arrangement. After a few years, these annual fees can add up to more than the initial cost of implementing a traditional in-house system. Worse than that is that you can then lose your power to determine your priorities: When you become reliant on this company, with its own schedule of priorities, to implement every change, however simple, you have effectively lost control of your technical infrastructure. This will make senior management very nervous. There is definitely a requirement for this sort of service. If you urgently need an exceptionally high class of service or more expertise than you currently possess, then outsourcing is probably for you. You and your IT colleagues will always have something to do, be it user assistance, smoothing out integrations with an external provider, or working on something entirely new. Heck, if you outsource to IBM, the SysAdmins can go along for the ride and polish their expertise. What you need to figure out is how much your time is worth, because time is ultimately all that outsourcing will buy you and your organization. Now you just need to convince your nervous CEO. Cheers, Michael

Read the article
What is the coolest thing you can do in <10 lines of simple code? Help me inspire beginners!

- by Tom Ritter

I'm looking for the coolest thing you can do in a few lines of simple code. I'm sure you can write a Mandelbrot set in Haskell in 15 lines but it's difficult to follow. My goal is to inspire students that programming is cool. We know that programming is cool because you can create anything you imagine - it's the ultimate creative outlet. I want to inspire these beginners and get them over as many early-learning humps as I can. Now, my reasons are selfish. I'm teaching an Intro to Computing course to a group of 60 half-engineering, half business majors; all freshmen. They are the students who came from underprivileged High schools. From my past experience, the group is generally split as follows: a few rock-stars, some who try very hard and kind of get it, the few who try very hard and barely get it, and the few who don't care. I want to reach as many of these groups as effectively as I can. Here's an example of how I'd use a computer program to teach: Here's an example of what I'm looking for: a 1-line VBS script to get your computer to talk to you: CreateObject("sapi.spvoice").Speak InputBox("Enter your text","Talk it") I could use this to demonstrate order of operations. I'd show the code, let them play with it, then explain that There's a lot going on in that line, but the computer can make sense of it, because it knows the rules. Then I'd show them something like this: 4(5*5) / 10 + 9(.25 + .75) And you can see that first I need to do is (5*5). Then I can multiply for 4. And now I've created the Object. Dividing by 10 is the same as calling Speak - I can't Speak before I have an object, and I can't divide before I have 100. Then on the other side I first create an InputBox with some instructions for how to display it. When I hit enter on the input box it evaluates or "returns" whatever I entered. (Hint: 'oooooo' makes a funny sound) So when I say Speak, the right side is what to Speak. And I get that from the InputBox. So when you do several things on a line, like: x = 14 + y; You need to be aware of the order of things. First we add 14 and y. Then we put the result (what it evaluates to, or returns) into x. That's my goal, to have a bunch of these cool examples to demonstrate and teach the class while they have fun. I tried this example on my roommate and while I may not use this as the first lesson, she liked it and learned something. Some cool mathematica programs that make beautiful graphs or shapes that are easy to understand would be good ideas and I'm going to look into those. Here are some complicated actionscript examples but that's a bit too advanced and I can't teach flash. What other ideas do you have?

Read the article
Zen and the Art of File and Folder Organization

- by Mark Virtue

Is your desk a paragon of neatness, or does it look like a paper-bomb has gone off? If you’ve been putting off getting organized because the task is too huge or daunting, or you don’t know where to start, we’ve got 40 tips to get you on the path to zen mastery of your filing system. For all those readers who would like to get their files and folders organized, or, if they’re already organized, better organized—we have compiled a complete guide to getting organized and staying organized, a comprehensive article that will hopefully cover every possible tip you could want. Signs that Your Computer is Poorly Organized If your computer is a mess, you’re probably already aware of it. But just in case you’re not, here are some tell-tale signs: Your Desktop has over 40 icons on it “My Documents” contains over 300 files and 60 folders, including MP3s and digital photos You use the Windows’ built-in search facility whenever you need to find a file You can’t find programs in the out-of-control list of programs in your Start Menu You save all your Word documents in one folder, all your spreadsheets in a second folder, etc Any given file that you’re looking for may be in any one of four different sets of folders But before we start, here are some quick notes: We’re going to assume you know what files and folders are, and how to create, save, rename, copy and delete them The organization principles described in this article apply equally to all computer systems. However, the screenshots here will reflect how things look on Windows (usually Windows 7). We will also mention some useful features of Windows that can help you get organized. Everyone has their own favorite methodology of organizing and filing, and it’s all too easy to get into “My Way is Better than Your Way” arguments. The reality is that there is no perfect way of getting things organized. When I wrote this article, I tried to keep a generalist and objective viewpoint. I consider myself to be unusually well organized (to the point of obsession, truth be told), and I’ve had 25 years experience in collecting and organizing files on computers. So I’ve got a lot to say on the subject. But the tips I have described here are only one way of doing it. Hopefully some of these tips will work for you too, but please don’t read this as any sort of “right” way to do it. At the end of the article we’ll be asking you, the reader, for your own organization tips. Why Bother Organizing At All? For some, the answer to this question is self-evident. And yet, in this era of powerful desktop search software (the search capabilities built into the Windows Vista and Windows 7 Start Menus, and third-party programs like Google Desktop Search), the question does need to be asked, and answered. I have a friend who puts every file he ever creates, receives or downloads into his My Documents folder and doesn’t bother filing them into subfolders at all. He relies on the search functionality built into his Windows operating system to help him find whatever he’s looking for. And he always finds it. He’s a Search Samurai. For him, filing is a waste of valuable time that could be spent enjoying life! It’s tempting to follow suit. On the face of it, why would anyone bother to take the time to organize their hard disk when such excellent search software is available? Well, if all you ever want to do with the files you own is to locate and open them individually (for listening, editing, etc), then there’s no reason to ever bother doing one scrap of organization. But consider these common tasks that are not achievable with desktop search software: Find files manually. Often it’s not convenient, speedy or even possible to utilize your desktop search software to find what you want. It doesn’t work 100% of the time, or you may not even have it installed. Sometimes its just plain faster to go straight to the file you want, if you know it’s in a particular sub-folder, rather than trawling through hundreds of search results. Find groups of similar files (e.g. all your “work” files, all the photos of your Europe holiday in 2008, all your music videos, all the MP3s from Dark Side of the Moon, all your letters you wrote to your wife, all your tax returns). Clever naming of the files will only get you so far. Sometimes it’s the date the file was created that’s important, other times it’s the file format, and other times it’s the purpose of the file. How do you name a collection of files so that they’re easy to isolate based on any of the above criteria? Short answer, you can’t. Move files to a new computer. It’s time to upgrade your computer. How do you quickly grab all the files that are important to you? Or you decide to have two computers now – one for home and one for work. How do you quickly isolate only the work-related files to move them to the work computer? Synchronize files to other computers. If you have more than one computer, and you need to mirror some of your files onto the other computer (e.g. your music collection), then you need a way to quickly determine which files are to be synced and which are not. Surely you don’t want to synchronize everything? Choose which files to back up. If your backup regime calls for multiple backups, or requires speedy backups, then you’ll need to be able to specify which files are to be backed up, and which are not. This is not possible if they’re all in the same folder. Finally, if you’re simply someone who takes pleasure in being organized, tidy and ordered (me! me!), then you don’t even need a reason. Being disorganized is simply unthinkable. Tips on Getting Organized Here we present our 40 best tips on how to get organized. Or, if you’re already organized, to get better organized. Tip #1. Choose Your Organization System Carefully The reason that most people are not organized is that it takes time. And the first thing that takes time is deciding upon a system of organization. This is always a matter of personal preference, and is not something that a geek on a website can tell you. You should always choose your own system, based on how your own brain is organized (which makes the assumption that your brain is, in fact, organized). We can’t instruct you, but we can make suggestions: You may want to start off with a system based on the users of the computer. i.e. “My Files”, “My Wife’s Files”, My Son’s Files”, etc. Inside “My Files”, you might then break it down into “Personal” and “Business”. You may then realize that there are overlaps. For example, everyone may want to share access to the music library, or the photos from the school play. So you may create another folder called “Family”, for the “common” files. You may decide that the highest-level breakdown of your files is based on the “source” of each file. In other words, who created the files. You could have “Files created by ME (business or personal)”, “Files created by people I know (family, friends, etc)”, and finally “Files created by the rest of the world (MP3 music files, downloaded or ripped movies or TV shows, software installation files, gorgeous desktop wallpaper images you’ve collected, etc).” This system happens to be the one I use myself. See below: Mark is for files created by meVC is for files created by my company (Virtual Creations)Others is for files created by my friends and familyData is the rest of the worldAlso, Settings is where I store the configuration files and other program data files for my installed software (more on this in tip #34, below). Each folder will present its own particular set of requirements for further sub-organization. For example, you may decide to organize your music collection into sub-folders based on the artist’s name, while your digital photos might get organized based on the date they were taken. It can be different for every sub-folder! Another strategy would be based on “currentness”. Files you have yet to open and look at live in one folder. Ones that have been looked at but not yet filed live in another place. Current, active projects live in yet another place. All other files (your “archive”, if you like) would live in a fourth folder. (And of course, within that last folder you’d need to create a further sub-system based on one of the previous bullet points). Put some thought into this – changing it when it proves incomplete can be a big hassle! Before you go to the trouble of implementing any system you come up with, examine a wide cross-section of the files you own and see if they will all be able to find a nice logical place to sit within your system. Tip #2. When You Decide on Your System, Stick to It! There’s nothing more pointless than going to all the trouble of creating a system and filing all your files, and then whenever you create, receive or download a new file, you simply dump it onto your Desktop. You need to be disciplined – forever! Every new file you get, spend those extra few seconds to file it where it belongs! Otherwise, in just a month or two, you’ll be worse off than before – half your files will be organized and half will be disorganized – and you won’t know which is which! Tip #3. Choose the Root Folder of Your Structure Carefully Every data file (document, photo, music file, etc) that you create, own or is important to you, no matter where it came from, should be found within one single folder, and that one single folder should be located at the root of your C: drive (as a sub-folder of C:\). In other words, do not base your folder structure in standard folders like “My Documents”. If you do, then you’re leaving it up to the operating system engineers to decide what folder structure is best for you. And every operating system has a different system! In Windows 7 your files are found in C:\Users\YourName, whilst on Windows XP it was C:\Documents and Settings\YourName\My Documents. In UNIX systems it’s often /home/YourName. These standard default folders tend to fill up with junk files and folders that are not at all important to you. “My Documents” is the worst offender. Every second piece of software you install, it seems, likes to create its own folder in the “My Documents” folder. These folders usually don’t fit within your organizational structure, so don’t use them! In fact, don’t even use the “My Documents” folder at all. Allow it to fill up with junk, and then simply ignore it. It sounds heretical, but: Don’t ever visit your “My Documents” folder! Remove your icons/links to “My Documents” and replace them with links to the folders you created and you care about! Create your own file system from scratch! Probably the best place to put it would be on your D: drive – if you have one. This way, all your files live on one drive, while all the operating system and software component files live on the C: drive – simply and elegantly separated. The benefits of that are profound. Not only are there obvious organizational benefits (see tip #10, below), but when it comes to migrate your data to a new computer, you can (sometimes) simply unplug your D: drive and plug it in as the D: drive of your new computer (this implies that the D: drive is actually a separate physical disk, and not a partition on the same disk as C:). You also get a slight speed improvement (again, only if your C: and D: drives are on separate physical disks). Warning: From tip #12, below, you will see that it’s actually a good idea to have exactly the same file system structure – including the drive it’s filed on – on all of the computers you own. So if you decide to use the D: drive as the storage system for your own files, make sure you are able to use the D: drive on all the computers you own. If you can’t ensure that, then you can still use a clever geeky trick to store your files on the D: drive, but still access them all via the C: drive (see tip #17, below). If you only have one hard disk (C:), then create a dedicated folder that will contain all your files – something like C:\Files. The name of the folder is not important, but make it a single, brief word. There are several reasons for this: When creating a backup regime, it’s easy to decide what files should be backed up – they’re all in the one folder! If you ever decide to trade in your computer for a new one, you know exactly which files to migrate You will always know where to begin a search for any file If you synchronize files with other computers, it makes your synchronization routines very simple. It also causes all your shortcuts to continue to work on the other machines (more about this in tip #24, below). Once you’ve decided where your files should go, then put all your files in there – Everything! Completely disregard the standard, default folders that are created for you by the operating system (“My Music”, “My Pictures”, etc). In fact, you can actually relocate many of those folders into your own structure (more about that below, in tip #6). The more completely you get all your data files (documents, photos, music, etc) and all your configuration settings into that one folder, then the easier it will be to perform all of the above tasks. Once this has been done, and all your files live in one folder, all the other folders in C:\ can be thought of as “operating system” folders, and therefore of little day-to-day interest for us. Here’s a screenshot of a nicely organized C: drive, where all user files are located within the \Files folder: Tip #4. Use Sub-Folders This would be our simplest and most obvious tip. It almost goes without saying. Any organizational system you decide upon (see tip #1) will require that you create sub-folders for your files. Get used to creating folders on a regular basis. Tip #5. Don’t be Shy About Depth Create as many levels of sub-folders as you need. Don’t be scared to do so. Every time you notice an opportunity to group a set of related files into a sub-folder, do so. Examples might include: All the MP3s from one music CD, all the photos from one holiday, or all the documents from one client. It’s perfectly okay to put files into a folder called C:\Files\Me\From Others\Services\WestCo Bank\Statements\2009. That’s only seven levels deep. Ten levels is not uncommon. Of course, it’s possible to take this too far. If you notice yourself creating a sub-folder to hold only one file, then you’ve probably become a little over-zealous. On the other hand, if you simply create a structure with only two levels (for example C:\Files\Work) then you really haven’t achieved any level of organization at all (unless you own only six files!). Your “Work” folder will have become a dumping ground, just like your Desktop was, with most likely hundreds of files in it. Tip #6. Move the Standard User Folders into Your Own Folder Structure Most operating systems, including Windows, create a set of standard folders for each of its users. These folders then become the default location for files such as documents, music files, digital photos and downloaded Internet files. In Windows 7, the full list is shown below: Some of these folders you may never use nor care about (for example, the Favorites folder, if you’re not using Internet Explorer as your browser). Those ones you can leave where they are. But you may be using some of the other folders to store files that are important to you. Even if you’re not using them, Windows will still often treat them as the default storage location for many types of files. When you go to save a standard file type, it can become annoying to be automatically prompted to save it in a folder that’s not part of your own file structure. But there’s a simple solution: Move the folders you care about into your own folder structure! If you do, then the next time you go to save a file of the corresponding type, Windows will prompt you to save it in the new, moved location. Moving the folders is easy. Simply drag-and-drop them to the new location. Here’s a screenshot of the default My Music folder being moved to my custom personal folder (Mark): Tip #7. Name Files and Folders Intelligently This is another one that almost goes without saying, but we’ll say it anyway: Do not allow files to be created that have meaningless names like Document1.doc, or folders called New Folder (2). Take that extra 20 seconds and come up with a meaningful name for the file/folder – one that accurately divulges its contents without repeating the entire contents in the name. Tip #8. Watch Out for Long Filenames Another way to tell if you have not yet created enough depth to your folder hierarchy is that your files often require really long names. If you need to call a file Johnson Sales Figures March 2009.xls (which might happen to live in the same folder as Abercrombie Budget Report 2008.xls), then you might want to create some sub-folders so that the first file could be simply called March.xls, and living in the Clients\Johnson\Sales Figures\2009 folder. A well-placed file needs only a brief filename! Tip #9. Use Shortcuts! Everywhere! This is probably the single most useful and important tip we can offer. A shortcut allows a file to be in two places at once. Why would you want that? Well, the file and folder structure of every popular operating system on the market today is hierarchical. This means that all objects (files and folders) always live within exactly one parent folder. It’s a bit like a tree. A tree has branches (folders) and leaves (files). Each leaf, and each branch, is supported by exactly one parent branch, all the way back to the root of the tree (which, incidentally, is exactly why C:\ is called the “root folder” of the C: drive). That hard disks are structured this way may seem obvious and even necessary, but it’s only one way of organizing data. There are others: Relational databases, for example, organize structured data entirely differently. The main limitation of hierarchical filing structures is that a file can only ever be in one branch of the tree – in only one folder – at a time. Why is this a problem? Well, there are two main reasons why this limitation is a problem for computer users: The “correct” place for a file, according to our organizational rationale, is very often a very inconvenient place for that file to be located. Just because it’s correctly filed doesn’t mean it’s easy to get to. Your file may be “correctly” buried six levels deep in your sub-folder structure, but you may need regular and speedy access to this file every day. You could always move it to a more convenient location, but that would mean that you would need to re-file back to its “correct” location it every time you’d finished working on it. Most unsatisfactory. A file may simply “belong” in two or more different locations within your file structure. For example, say you’re an accountant and you have just completed the 2009 tax return for John Smith. It might make sense to you to call this file 2009 Tax Return.doc and file it under Clients\John Smith. But it may also be important to you to have the 2009 tax returns from all your clients together in the one place. So you might also want to call the file John Smith.doc and file it under Tax Returns\2009. The problem is, in a purely hierarchical filing system, you can’t put it in both places. Grrrrr! Fortunately, Windows (and most other operating systems) offers a way for you to do exactly that: It’s called a “shortcut” (also known as an “alias” on Macs and a “symbolic link” on UNIX systems). Shortcuts allow a file to exist in one place, and an icon that represents the file to be created and put anywhere else you please. In fact, you can create a dozen such icons and scatter them all over your hard disk. Double-clicking on one of these icons/shortcuts opens up the original file, just as if you had double-clicked on the original file itself. Consider the following two icons: The one on the left is the actual Word document, while the one on the right is a shortcut that represents the Word document. Double-clicking on either icon will open the same file. There are two main visual differences between the icons: The shortcut will have a small arrow in the lower-left-hand corner (on Windows, anyway) The shortcut is allowed to have a name that does not include the file extension (the “.docx” part, in this case) You can delete the shortcut at any time without losing any actual data. The original is still intact. All you lose is the ability to get to that data from wherever the shortcut was. So why are shortcuts so great? Because they allow us to easily overcome the main limitation of hierarchical file systems, and put a file in two (or more) places at the same time. You will always have files that don’t play nice with your organizational rationale, and can’t be filed in only one place. They demand to exist in two places. Shortcuts allow this! Furthermore, they allow you to collect your most often-opened files and folders together in one spot for convenient access. The cool part is that the original files stay where they are, safe forever in their perfectly organized location. So your collection of most often-opened files can – and should – become a collection of shortcuts! If you’re still not convinced of the utility of shortcuts, consider the following well-known areas of a typical Windows computer: The Start Menu (and all the programs that live within it) The Quick Launch bar (or the Superbar in Windows 7) The “Favorite folders” area in the top-left corner of the Windows Explorer window (in Windows Vista or Windows 7) Your Internet Explorer Favorites or Firefox Bookmarks Each item in each of these areas is a shortcut! Each of those areas exist for one purpose only: For convenience – to provide you with a collection of the files and folders you access most often. It should be easy to see by now that shortcuts are designed for one single purpose: To make accessing your files more convenient. Each time you double-click on a shortcut, you are saved the hassle of locating the file (or folder, or program, or drive, or control panel icon) that it represents. Shortcuts allow us to invent a golden rule of file and folder organization: “Only ever have one copy of a file – never have two copies of the same file. Use a shortcut instead” (this rule doesn’t apply to copies created for backup purposes, of course!) There are also lesser rules, like “don’t move a file into your work area – create a shortcut there instead”, and “any time you find yourself frustrated with how long it takes to locate a file, create a shortcut to it and place that shortcut in a convenient location.” So how to we create these massively useful shortcuts? There are two main ways: “Copy” the original file or folder (click on it and type Ctrl-C, or right-click on it and select Copy): Then right-click in an empty area of the destination folder (the place where you want the shortcut to go) and select Paste shortcut: Right-drag (drag with the right mouse button) the file from the source folder to the destination folder. When you let go of the mouse button at the destination folder, a menu pops up: Select Create shortcuts here. Note that when shortcuts are created, they are often named something like Shortcut to Budget Detail.doc (windows XP) or Budget Detail – Shortcut.doc (Windows 7). If you don’t like those extra words, you can easily rename the shortcuts after they’re created, or you can configure Windows to never insert the extra words in the first place (see our article on how to do this). And of course, you can create shortcuts to folders too, not just to files! Bottom line: Whenever you have a file that you’d like to access from somewhere else (whether it’s convenience you’re after, or because the file simply belongs in two places), create a shortcut to the original file in the new location. Tip #10. Separate Application Files from Data Files Any digital organization guru will drum this rule into you. Application files are the components of the software you’ve installed (e.g. Microsoft Word, Adobe Photoshop or Internet Explorer). Data files are the files that you’ve created for yourself using that software (e.g. Word Documents, digital photos, emails or playlists). Software gets installed, uninstalled and upgraded all the time. Hopefully you always have the original installation media (or downloaded set-up file) kept somewhere safe, and can thus reinstall your software at any time. This means that the software component files are of little importance. Whereas the files you have created with that software is, by definition, important. It’s a good rule to always separate unimportant files from important files. So when your software prompts you to save a file you’ve just created, take a moment and check out where it’s suggesting that you save the file. If it’s suggesting that you save the file into the same folder as the software itself, then definitely don’t follow that suggestion. File it in your own folder! In fact, see if you can find the program’s configuration option that determines where files are saved by default (if it has one), and change it. Tip #11. Organize Files Based on Purpose, Not on File Type If you have, for example a folder called Work\Clients\Johnson, and within that folder you have two sub-folders, Word Documents and Spreadsheets (in other words, you’re separating “.doc” files from “.xls” files), then chances are that you’re not optimally organized. It makes little sense to organize your files based on the program that created them. Instead, create your sub-folders based on the purpose of the file. For example, it would make more sense to create sub-folders called Correspondence and Financials. It may well be that all the files in a given sub-folder are of the same file-type, but this should be more of a coincidence and less of a design feature of your organization system. Tip #12. Maintain the Same Folder Structure on All Your Computers In other words, whatever organizational system you create, apply it to every computer that you can. There are several benefits to this: There’s less to remember. No matter where you are, you always know where to look for your files If you copy or synchronize files from one computer to another, then setting up the synchronization job becomes very simple Shortcuts can be copied or moved from one computer to another with ease (assuming the original files are also copied/moved). There’s no need to find the target of the shortcut all over again on the second computer Ditto for linked files (e.g Word documents that link to data in a separate Excel file), playlists, and any files that reference the exact file locations of other files. This applies even to the drive that your files are stored on. If your files are stored on C: on one computer, make sure they’re stored on C: on all your computers. Otherwise all your shortcuts, playlists and linked files will stop working! Tip #13. Create an “Inbox” Folder Create yourself a folder where you store all files that you’re currently working on, or that you haven’t gotten around to filing yet. You can think of this folder as your “to-do” list. You can call it “Inbox” (making it the same metaphor as your email system), or “Work”, or “To-Do”, or “Scratch”, or whatever name makes sense to you. It doesn’t matter what you call it – just make sure you have one! Once you have finished working on a file, you then move it from the “Inbox” to its correct location within your organizational structure. You may want to use your Desktop as this “Inbox” folder. Rightly or wrongly, most people do. It’s not a bad place to put such files, but be careful: If you do decide that your Desktop represents your “to-do” list, then make sure that no other files find their way there. In other words, make sure that your “Inbox”, wherever it is, Desktop or otherwise, is kept free of junk – stray files that don’t belong there. So where should you put this folder, which, almost by definition, lives outside the structure of the rest of your filing system? Well, first and foremost, it has to be somewhere handy. This will be one of your most-visited folders, so convenience is key. Putting it on the Desktop is a great option – especially if you don’t have any other folders on your Desktop: the folder then becomes supremely easy to find in Windows Explorer: You would then create shortcuts to this folder in convenient spots all over your computer (“Favorite Links”, “Quick Launch”, etc). Tip #14. Ensure You have Only One “Inbox” Folder Once you’ve created your “Inbox” folder, don’t use any other folder location as your “to-do list”. Throw every incoming or created file into the Inbox folder as you create/receive it. This keeps the rest of your computer pristine and free of randomly created or downloaded junk. The last thing you want to be doing is checking multiple folders to see all your current tasks and projects. Gather them all together into one folder. Here are some tips to help ensure you only have one Inbox: Set the default “save” location of all your programs to this folder. Set the default “download” location for your browser to this folder. If this folder is not your desktop (recommended) then also see if you can make a point of not putting “to-do” files on your desktop. This keeps your desktop uncluttered and Zen-like: (the Inbox folder is in the bottom-right corner) Tip #15. Be Vigilant about Clearing Your “Inbox” Folder This is one of the keys to staying organized. If you let your “Inbox” overflow (i.e. allow there to be more than, say, 30 files or folders in there), then you’re probably going to start feeling like you’re overwhelmed: You’re not keeping up with your to-do list. Once your Inbox gets beyond a certain point (around 30 files, studies have shown), then you’ll simply start to avoid it. You may continue to put files in there, but you’ll be scared to look at it, fearing the “out of control” feeling that all overworked, chaotic or just plain disorganized people regularly feel. So, here’s what you can do: Visit your Inbox/to-do folder regularly (at least five times per day). Scan the folder regularly for files that you have completed working on and are ready for filing. File them immediately. Make it a source of pride to keep the number of files in this folder as small as possible. If you value peace of mind, then make the emptiness of this folder one of your highest (computer) priorities If you know that a particular file has been in the folder for more than, say, six weeks, then admit that you’re not actually going to get around to processing it, and move it to its final resting place. Tip #16. File Everything Immediately, and Use Shortcuts for Your Active Projects As soon as you create, receive or download a new file, store it away in its “correct” folder immediately. Then, whenever you need to work on it (possibly straight away), create a shortcut to it in your “Inbox” (“to-do”) folder or your desktop. That way, all your files are always in their “correct” locations, yet you still have immediate, convenient access to your current, active files. When you finish working on a file, simply delete the shortcut. Ideally, your “Inbox” folder – and your Desktop – should contain no actual files or folders. They should simply contain shortcuts. Tip #17. Use Directory Symbolic Links (or Junctions) to Maintain One Unified Folder Structure Using this tip, we can get around a potential hiccup that we can run into when creating our organizational structure – the issue of having more than one drive on our computer (C:, D:, etc). We might have files we need to store on the D: drive for space reasons, and yet want to base our organized folder structure on the C: drive (or vice-versa). Your chosen organizational structure may dictate that all your files must be accessed from the C: drive (for example, the root folder of all your files may be something like C:\Files). And yet you may still have a D: drive and wish to take advantage of the hundreds of spare Gigabytes that it offers. Did you know that it’s actually possible to store your files on the D: drive and yet access them as if they were on the C: drive? And no, we’re not talking about shortcuts here (although the concept is very similar). By using the shell command mklink, you can essentially take a folder that lives on one drive and create an alias for it on a different drive (you can do lots more than that with mklink – for a full rundown on this programs capabilities, see our dedicated article). These aliases are called directory symbolic links (and used to be known as junctions). You can think of them as “virtual” folders. They function exactly like regular folders, except they’re physically located somewhere else. For example, you may decide that your entire D: drive contains your complete organizational file structure, but that you need to reference all those files as if they were on the C: drive, under C:\Files. If that was the case you could create C:\Files as a directory symbolic link – a link to D:, as follows: mklink /d c:\files d:\ Or it may be that the only files you wish to store on the D: drive are your movie collection. You could locate all your movie files in the root of your D: drive, and then link it to C:\Files\Media\Movies, as follows: mklink /d c:\files\media\movies d:\ (Needless to say, you must run these commands from a command prompt – click the Start button, type cmd and press Enter) Tip #18. Customize Your Folder Icons This is not strictly speaking an organizational tip, but having unique icons for each folder does allow you to more quickly visually identify which folder is which, and thus saves you time when you’re finding files. An example is below (from my folder that contains all files downloaded from the Internet): To learn how to change your folder icons, please refer to our dedicated article on the subject. Tip #19. Tidy Your Start Menu The Windows Start Menu is usually one of the messiest parts of any Windows computer. Every program you install seems to adopt a completely different approach to placing icons in this menu. Some simply put a single program icon. Others create a folder based on the name of the software. And others create a folder based on the name of the software manufacturer. It’s chaos, and can make it hard to find the software you want to run. Thankfully we can avoid this chaos with useful operating system features like Quick Launch, the Superbar or pinned start menu items. Even so, it would make a lot of sense to get into the guts of the Start Menu itself and give it a good once-over. All you really need to decide is how you’re going to organize your applications. A structure based on the purpose of the application is an obvious candidate. Below is an example of one such structure: In this structure, Utilities means software whose job it is to keep the computer itself running smoothly (configuration tools, backup software, Zip programs, etc). Applications refers to any productivity software that doesn’t fit under the headings Multimedia, Graphics, Internet, etc. In case you’re not aware, every icon in your Start Menu is a shortcut and can be manipulated like any other shortcut (copied, moved, deleted, etc). With the Windows Start Menu (all version of Windows), Microsoft has decided that there be two parallel folder structures to store your Start Menu shortcuts. One for you (the logged-in user of the computer) and one for all users of the computer. Having two parallel structures can often be redundant: If you are the only user of the computer, then having two parallel structures is totally redundant. Even if you have several users that regularly log into the computer, most of your installed software will need to be made available to all users, and should thus be moved out of the “just you” version of the Start Menu and into the “all users” area. To take control of your Start Menu, so you can start organizing it, you’ll need to know how to access the actual folders and shortcut files that make up the Start Menu (both versions of it). To find these folders and files, click the Start button and then right-click on the All Programs text (Windows XP users should right-click on the Start button itself): The Open option refers to the “just you” version of the Start Menu, while the Open All Users option refers to the “all users” version. Click on the one you want to organize. A Windows Explorer window then opens with your chosen version of the Start Menu selected. From there it’s easy. Double-click on the Programs folder and you’ll see all your folders and shortcuts. Now you can delete/rename/move until it’s just the way you want it. Note: When you’re reorganizing your Start Menu, you may want to have two Explorer windows open at the same time – one showing the “just you” version and one showing the “all users” version. You can drag-and-drop between the windows. Tip #20. Keep Your Start Menu Tidy Once you have a perfectly organized Start Menu, try to be a little vigilant about keeping it that way. Every time you install a new piece of software, the icons that get created will almost certainly violate your organizational structure. So to keep your Start Menu pristine and organized, make sure you do the following whenever you install a new piece of software: Check whether the software was installed into the “just you” area of the Start Menu, or the “all users” area, and then move it to the correct area. Remove all the unnecessary icons (like the “Read me” icon, the “Help” icon (you can always open the help from within the software itself when it’s running), the “Uninstall” icon, the link(s)to the manufacturer’s website, etc) Rename the main icon(s) of the software to something brief that makes sense to you. For example, you might like to rename Microsoft Office Word 2010 to simply Word Move the icon(s) into the correct folder based on your Start Menu organizational structure And don’t forget: when you uninstall a piece of software, the software’s uninstall routine is no longer going to be able to remove the software’s icon from the Start Menu (because you moved and/or renamed it), so you’ll need to remove that icon manually. Tip #21. Tidy C:\ The root of your C: drive (C:\) is a common dumping ground for files and folders – both by the users of your computer and by the software that you install on your computer. It can become a mess. There’s almost no software these days that requires itself to be installed in C:\. 99% of the time it can and should be installed into C:\Program Files. And as for your own files, well, it’s clear that they can (and almost always should) be stored somewhere else. In an ideal world, your C:\ folder should look like this (on Windows 7): Note that there are some system files and folders in C:\ that are usually and deliberately “hidden” (such as the Windows virtual memory file pagefile.sys, the boot loader file bootmgr, and the System Volume Information folder). Hiding these files and folders is a good idea, as they need to stay where they are and are almost never needed to be opened or even seen by you, the user. Hiding them prevents you from accidentally messing with them, and enhances your sense of order and well-being when you look at your C: drive folder. Tip #22. Tidy Your Desktop The Desktop is probably the most abused part of a Windows computer (from an organization point of view). It usually serves as a dumping ground for all incoming files, as well as holding icons to oft-used applications, plus some regularly opened files and folders. It often ends up becoming an uncontrolled mess. See if you can avoid this. Here’s why… Application icons (Word, Internet Explorer, etc) are often found on the Desktop, but it’s unlikely that this is the optimum place for them. The “Quick Launch” bar (or the Superbar in Windows 7) is always visible and so represents a perfect location to put your icons. You’ll only be able to see the icons on your Desktop when all your programs are minimized. It might be time to get your application icons off your desktop… You may have decided that the Inbox/To-do folder on your computer (see tip #13, above) should be your Desktop. If so, then enough said. Simply be vigilant about clearing it and preventing it from being polluted by junk files (see tip #15, above). On the other hand, if your Desktop is not acting as your “Inbox” folder, then there’s no reason for it to have any data files or folders on it at all, except perhaps a couple of shortcuts to often-opened files and folders (either ongoing or current projects). Everything else should be moved to your “Inbox” folder. In an ideal world, it might look like this: Tip #23. Move Permanent Items on Your Desktop Away from the Top-Left Corner When files/folders are dragged onto your desktop in a Windows Explorer window, or when shortcuts are created on your Desktop from Internet Explorer, those icons are always placed in the top-left corner – or as close as they can get. If you have other files, folders or shortcuts that you keep on the Desktop permanently, then it’s a good idea to separate these permanent icons from the transient ones, so that you can quickly identify which ones the transients are. An easy way to do this is to move all your permanent icons to the right-hand side of your Desktop. That should keep them separated from incoming items. Tip #24. Synchronize If you have more than one computer, you’ll almost certainly want to share files between them. If the computers are permanently attached to the same local network, then there’s no need to store multiple copies of any one file or folder – shortcuts will suffice. However, if the computers are not always on the same network, then you will at some point need to copy files between them. For files that need to permanently live on both computers, the ideal way to do this is to synchronize the files, as opposed to simply copying them. We only have room here to write a brief summary of synchronization, not a full article. In short, there are several different types of synchronization: Where the contents of one folder are accessible anywhere, such as with Dropbox Where the contents of any number of folders are accessible anywhere, such as with Windows Live Mesh Where any files or folders from anywhere on your computer are synchronized with exactly one other computer, such as with the Windows “Briefcase”, Microsoft SyncToy, or (much more powerful, yet still free) SyncBack from 2BrightSparks. This only works when both computers are on the same local network, at least temporarily. A great advantage of synchronization solutions is that once you’ve got it configured the way you want it, then the sync process happens automatically, every time. Click a button (or schedule it to happen automatically) and all your files are automagically put where they’re supposed to be. If you maintain the same file and folder structure on both computers, then you can also sync files depend upon the correct location of other files, like shortcuts, playlists and office documents that link to other office documents, and the synchronized files still work on the other computer! Tip #25. Hide Files You Never Need to See If you have your files well organized, you will often be able to tell if a file is out of place just by glancing at the contents of a folder (for example, it should be pretty obvious if you look in a folder that contains all the MP3s from one music CD and see a Word document in there). This is a good thing – it allows you to determine if there are files out of place with a quick glance. Yet sometimes there are files in a folder that seem out of place but actually need to be there, such as the “folder art” JPEGs in music folders, and various files in the root of the C: drive. If such files never need to be opened by you, then a good idea is to simply hide them. Then, the next time you glance at the folder, you won’t have to remember whether that file was supposed to be there or not, because you won’t see it at all! To hide a file, simply right-click on it and choose Properties: Then simply tick the Hidden tick-box: Tip #26. Keep Every Setup File These days most software is downloaded from the Internet. Whenever you download a piece of software, keep it. You’ll never know when you need to reinstall the software. Further, keep with it an Internet shortcut that links back to the website where you originally downloaded it, in case you ever need to check for updates. See tip #33 below for a full description of the excellence of organizing your setup files. Tip #27. Try to Minimize the Number of Folders that Contain Both Files and Sub-folders Some of the folders in your organizational structure will contain only files. Others will contain only sub-folders. And you will also have some folders that contain both files and sub-folders. You will notice slight improvements in how long it takes you to locate a file if you try to avoid this third type of folder. It’s not always possible, of course – you’ll always have some of these folders, but see if you can avoid it. One way of doing this is to take all the leftover files that didn’t end up getting stored in a sub-folder and create a special “Miscellaneous” or “Other” folder for them. Tip #28. Starting a Filename with an Underscore Brings it to the Top of a List Further to the previous tip, if you name that “Miscellaneous” or “Other” folder in such a way that its name begins with an underscore “_”, then it will appear at the top of the list of files/folders. The screenshot below is an example of this. Each folder in the list contains a set of digital photos. The folder at the top of the list, _Misc, contains random photos that didn’t deserve their own dedicated folder: Tip #29. Clean Up those CD-ROMs and (shudder!) Floppy Disks Have you got a pile of CD-ROMs stacked on a shelf of your office? Old photos, or files you archived off onto CD-ROM (or even worse, floppy disks!) because you didn’t have enough disk space at the time? In the meantime have you upgraded your computer and now have 500 Gigabytes of space you don’t know what to do with? If so, isn’t it time you tidied up that stack of disks and filed them into your gorgeous new folder structure? So what are you waiting for? Bite the bullet, copy them all back onto your computer, file them in their appropriate folders, and then back the whole lot up onto a shiny new 1000Gig external hard drive! Useful Folders to Create This next section suggests some useful folders that you might want to create within your folder structure. I’ve personally found them to be indispensable. The first three are all about convenience – handy folders to create and then put somewhere that you can always access instantly. For each one, it’s not so important where the actual folder is located, but it’s very important where you put the shortcut(s) to the folder. You might want to locate the shortcuts: On your Desktop In your “Quick Launch” area (or pinned to your Windows 7 Superbar) In your Windows Explorer “Favorite Links” area Tip #30. Create an “Inbox” (“To-Do”) Folder This has already been mentioned in depth (see tip #13), but we wanted to reiterate its importance here. This folder contains all the recently created, received or downloaded files that you have not yet had a chance to file away properly, and it also may contain files that you have yet to process. In effect, it becomes a sort of “to-do list”. It doesn’t have to be called “Inbox” – you can call it whatever you want. Tip #31. Create a Folder where Your Current Projects are Collected Rather than going hunting for them all the time, or dumping them all on your desktop, create a special folder where you put links (or work folders) for each of the projects you’re currently working on. You can locate this folder in your “Inbox” folder, on your desktop, or anywhere at all – just so long as there’s a way of getting to it quickly, such as putting a link to it in Windows Explorer’s “Favorite Links” area: Tip #32. Create a Folder for Files and Folders that You Regularly Open You will always have a few files that you open regularly, whether it be a spreadsheet of your current accounts, or a favorite playlist. These are not necessarily “current projects”, rather they’re simply files that you always find yourself opening. Typically such files would be located on your desktop (or even better, shortcuts to those files). Why not collect all such shortcuts together and put them in their own special folder? As with the “Current Projects” folder (above), you would want to locate that folder somewhere convenient. Below is an example of a folder called “Quick links”, with about seven files (shortcuts) in it, that is accessible through the Windows Quick Launch bar: See tip #37 below for a full explanation of the power of the Quick Launch bar. Tip #33. Create a “Set-ups” Folder A typical computer has dozens of applications installed on it. For each piece of software, there are often many different pieces of information you need to keep track of, including: The original installation setup file(s). This can be anything from a simple 100Kb setup.exe file you downloaded from a website, all the way up to a 4Gig ISO file that you copied from a DVD-ROM that you purchased. The home page of the software manufacturer (in case you need to look up something on their support pages, their forum or their online help) The page containing the download link for your actual file (in case you need to re-download it, or download an upgraded version) The serial number Your proof-of-purchase documentation Any other template files, plug-ins, themes, etc that also need to get installed For each piece of software, it’s a great idea to gather all of these files together and put them in a single folder. The folder can be the name of the software (plus possibly a very brief description of what it’s for – in case you can’t remember what the software does based in its name). Then you would gather all of these folders together into one place, and call it something like “Software” or “Setups”. If you have enough of these folders (I have several hundred, being a geek, collected over 20 years), then you may want to further categorize them. My own categorization structure is based on “platform” (operating system): The last seven folders each represents one platform/operating system, while _Operating Systems contains set-up files for installing the operating systems themselves. _Hardware contains ROMs for hardware I own, such as routers. Within the Windows folder (above), you can see the beginnings of the vast library of software I’ve compiled over the years: An example of a typical application folder looks like this: Tip #34. Have a “Settings” Folder We all know that our documents are important. So are our photos and music files. We save all of these files into folders, and then locate them afterwards and double-click on them to open them. But there are many files that are important to us that can’t be saved into folders, and then searched for and double-clicked later on. These files certainly contain important information that we need, but are often created internally by an application, and saved wherever that application feels is appropriate. A good example of this is the “PST” file that Outlook creates for us and uses to store all our emails, contacts, appointments and so forth. Another example would be the collection of Bookmarks that Firefox stores on your behalf. And yet another example would be the customized settings and configuration files of our all our software. Granted, most Windows programs store their configuration in the Registry, but there are still many programs that use configuration files to store their settings. Imagine if you lost all of the above files! And yet, when people are backing up their computers, they typically only back up the files they know about – those that are stored in the “My Documents” folder, etc. If they had a hard disk failure or their computer was lost or stolen, their backup files would not include some of the most vital files they owned. Also, when migrating to a new computer, it’s vital to ensure that these files make the journey. It can be a very useful idea to create yourself a folder to store all your “settings” – files that are important to you but which you never actually search for by name and double-click on to open them. Otherwise, next time you go to set up a new computer just the way you want it, you’ll need to spend hours recreating the configuration of your previous computer! So how to we get our important files into this folder? Well, we have a few options: Some programs (such as Outlook and its PST files) allow you to place these files wherever you want. If you delve into the program’s options, you will find a setting somewhere that controls the location of the important settings files (or “personal storage” – PST – when it comes to Outlook) Some programs do not allow you to change such locations in any easy way, but if you get into the Registry, you can sometimes find a registry key that refers to the location of the file(s). Simply move the file into your Settings folder and adjust the registry key to refer to the new location. Some programs stubbornly refuse to allow their settings files to be placed anywhere other then where they stipulate. When faced with programs like these, you have three choices: (1) You can ignore those files, (2) You can copy the files into your Settings folder (let’s face it – settings don’t change very often), or (3) you can use synchronization software, such as the Windows Briefcase, to make synchronized copies of all your files in your Settings folder. All you then have to do is to remember to run your sync software periodically (perhaps just before you run your backup software!). There are some other things you may decide to locate inside this new “Settings” folder: Exports of registry keys (from the many applications that store their configurations in the Registry). This is useful for backup purposes or for migrating to a new computer Notes you’ve made about all the specific customizations you have made to a particular piece of software (so that you’ll know how to do it all again on your next computer) Shortcuts to webpages that detail how to tweak certain aspects of your operating system or applications so they are just the way you like them (such as how to remove the words “Shortcut to” from the beginning of newly created shortcuts). In other words, you’d want to create shortcuts to half the pages on the How-To Geek website! Here’s an example of a “Settings” folder: Windows Features that Help with Organization This section details some of the features of Microsoft Windows that are a boon to anyone hoping to stay optimally organized. Tip #35. Use the “Favorite Links” Area to Access Oft-Used Folders Once you’ve created your great new filing system, work out which folders you access most regularly, or which serve as great starting points for locating the rest of the files in your folder structure, and then put links to those folders in your “Favorite Links” area of the left-hand side of the Windows Explorer window (simply called “Favorites” in Windows 7): Some ideas for folders you might want to add there include: Your “Inbox” folder (or whatever you’ve called it) – most important! The base of your filing structure (e.g. C:\Files) A folder containing shortcuts to often-accessed folders on other computers around the network (shown above as Network Folders) A folder containing shortcuts to your current projects (unless that folder is in your “Inbox” folder) Getting folders into this area is very simple – just locate the folder you’re interested in and drag it there! Tip #36. Customize the Places Bar in the File/Open and File/Save Boxes Consider the screenshot below: The highlighted icons (collectively known as the “Places Bar”) can be customized to refer to any folder location you want, allowing instant access to any part of your organizational structure. Note: These File/Open and File/Save boxes have been superseded by new versions that use the Windows Vista/Windows 7 “Favorite Links”, but the older versions (shown above) are still used by a surprisingly large number of applications. The easiest way to customize these icons is to use the Group Policy Editor, but not everyone has access to this program. If you do, open it up and navigate to: User Configuration > Administrative Templates > Windows Components > Windows Explorer > Common Open File Dialog If you don’t have access to the Group Policy Editor, then you’ll need to get into the Registry. Navigate to: HKEY_CURRENT_USER \ Software \ Microsoft \ Windows \ CurrentVersion \ Policies \ comdlg32 \ Placesbar It should then be easy to make the desired changes. Log off and log on again to allow the changes to take effect. Tip #37. Use the Quick Launch Bar as a Application and File Launcher That Quick Launch bar (to the right of the Start button) is a lot more useful than people give it credit for. Most people simply have half a dozen icons in it, and use it to start just those programs. But it can actually be used to instantly access just about anything in your filing system: For complete instructions on how to set this up, visit our dedicated article on this topic. Tip #38. Put a Shortcut to Windows Explorer into Your Quick Launch Bar This is only necessary in Windows Vista and Windows XP. The Microsoft boffins finally got wise and added it to the Windows 7 Superbar by default. Windows Explorer – the program used for managing your files and folders – is one of the most useful programs in Windows. Anyone who considers themselves serious about being organized needs instant access to this program at any time. A great place to create a shortcut to this program is in the Windows XP and Windows Vista “Quick Launch” bar: To get it there, locate it in your Start Menu (usually under “Accessories”) and then right-drag it down into your Quick Launch bar (and create a copy). Tip #39. Customize the Starting Folder for Your Windows 7 Explorer Superbar Icon If you’re on Windows 7, your Superbar will include a Windows Explorer icon. Clicking on the icon will launch Windows Explorer (of course), and will start you off in your “Libraries” folder. Libraries may be fine as a starting point, but if you have created yourself an “Inbox” folder, then it would probably make more sense to start off in this folder every time you launch Windows Explorer. To change this default/starting folder location, then first right-click the Explorer icon in the Superbar, and then right-click Properties:Then, in Target field of the Windows Explorer Properties box that appears, type %windir%\explorer.exe followed by the path of the folder you wish to start in. For example: %windir%\explorer.exe C:\Files If that folder happened to be on the Desktop (and called, say, “Inbox”), then you would use the following cleverness: %windir%\explorer.exe shell:desktop\Inbox Then click OK and test it out. Tip #40. Ummmmm…. No, that’s it. I can’t think of another one. That’s all of the tips I can come up with. I only created this one because 40 is such a nice round number… Case Study – An Organized PC To finish off the article, I have included a few screenshots of my (main) computer (running Vista). The aim here is twofold: To give you a sense of what it looks like when the above, sometimes abstract, tips are applied to a real-life computer, and To offer some ideas about folders and structure that you may want to steal to use on your own PC. Let’s start with the C: drive itself. Very minimal. All my files are contained within C:\Files. I’ll confine the rest of the case study to this folder: That folder contains the following: Mark: My personal files VC: My business (Virtual Creations, Australia) Others contains files created by friends and family Data contains files from the rest of the world (can be thought of as “public” files, usually downloaded from the Net) Settings is described above in tip #34 The Data folder contains the following sub-folders: Audio: Radio plays, audio books, podcasts, etc Development: Programmer and developer resources, sample source code, etc (see below) Humour: Jokes, funnies (those emails that we all receive) Movies: Downloaded and ripped movies (all legal, of course!), their scripts, DVD covers, etc. Music: (see below) Setups: Installation files for software (explained in full in tip #33) System: (see below) TV: Downloaded TV shows Writings: Books, instruction manuals, etc (see below) The Music folder contains the following sub-folders: Album covers: JPEG scans Guitar tabs: Text files of guitar sheet music Lists: e.g. “Top 1000 songs of all time” Lyrics: Text files MIDI: Electronic music files MP3 (representing 99% of the Music folder): MP3s, either ripped from CDs or downloaded, sorted by artist/album name Music Video: Video clips Sheet Music: usually PDFs The Data\Writings folder contains the following sub-folders: (all pretty self-explanatory) The Data\Development folder contains the following sub-folders: Again, all pretty self-explanatory (if you’re a geek) The Data\System folder contains the following sub-folders: These are usually themes, plug-ins and other downloadable program-specific resources. The Mark folder contains the following sub-folders: From Others: Usually letters that other people (friends, family, etc) have written to me For Others: Letters and other things I have created for other people Green Book: None of your business Playlists: M3U files that I have compiled of my favorite songs (plus one M3U playlist file for every album I own) Writing: Fiction, philosophy and other musings of mine Mark Docs: Shortcut to C:\Users\Mark Settings: Shortcut to C:\Files\Settings\Mark The Others folder contains the following sub-folders: The VC (Virtual Creations, my business – I develop websites) folder contains the following sub-folders: And again, all of those are pretty self-explanatory. Conclusion These tips have saved my sanity and helped keep me a productive geek, but what about you? What tips and tricks do you have to keep your files organized? Please share them with us in the comments. Come on, don’t be shy… Similar Articles Productive Geek Tips Fix For When Windows Explorer in Vista Stops Showing File NamesWhy Did Windows Vista’s Music Folder Icon Turn Yellow?Print or Create a Text File List of the Contents in a Directory the Easy WayCustomize the Windows 7 or Vista Send To MenuAdd Copy To / Move To on Windows 7 or Vista Right-Click Menu TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Track Daily Goals With 42Goals Video Toolbox is a Superb Online Video Editor Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics

Read the article
Lightning talk: Coderetreat

- by Michael Williamson

In the spirit of trying to encourage more deliberate practice amongst coders in Red Gate, Lauri Pesonen had the idea of running a coderetreat in Red Gate. Lauri and I ran the first one a few weeks ago: given that neither of us hadn’t even been to a coderetreat before, let alone run one, I think it turned out quite well. The participants gave positive feedback, saying that they enjoyed the day, wrote some thought-provoking code and would do it again. Sam Blackburn was one of the attendees, and gave a lightning talk to the other developers in one of our regular lightning talk sessions: In case you can’t watch the video, I’ve transcribed the talk below, although I’d recommend watching the video if you can — I didn’t have much time to do the transcribing! So, what is a coderetreat? So it’s not just something in Red Gate, there’s a website and everything, although it’s not a very big website. It calls itself a community network. The basic ideas behind coderetreat are: you’ve got one day, and you split it into one hour sections. You spend three quarters of that coding, and do a little retrospective at the end. You’re supposed to start fresh each, we were told to delete our code after every session. We were in pairs, swapping after each session, and we did the same task every time. In fact, Conway’s Game of Life is the only task mentioned anywhere that I find for coderetreat. So I don’t know what we’ll do next time, or if we’re meant to do the same thing again. There are some guiding principles which felt to us like restrictions, that you have to code in crazy ways to encourage better code. Final thing is that it’s supposed to be free for outsiders to join. It’s meant to be a kind of networking thing, where you link up with people from other companies. We had a pilot day with Michael and Lauri. Since it was basically the first time any of us had done anything like this, everybody was from Red Gate. We didn’t chat to anybody else for the initial one. The task was Conway’s Game of Life, which most of you have probably heard of it, all but one of us knew about it when did the coderetreat. I won’t got into the details of what it is, but it felt like the right size of task, basically one or two groups actually produced something working by the end of the day, and of course that doesn’t mean it’s necessarily a day’s work to produce that because we were starting again every hour. The task really drives you more than trying to create good code, I found. It was really tempting to try and get it working rather than stick to the rules. But it’s really good to stop and try again because there are so many what-ifs when you’ve finished writing something, “what if I’d done it this way?”. You can answer all those questions at a coderetreat because it’s not about getting a product out the door, it’s about learning and playing with ideas. So we had all these different practices we were trying. I’ll try and go through most of these. Single responsibility is this idea that everything should do just one thing. It was the very first session, we were still trying to figure out how do you go about the Game of Life? So by the end of forty-five minutes hadn’t produced very much for that first session. We were still thinking, “Do we start with a board, how do we represent all these squares? It can be infinitely big, help, this is getting really difficult!”. So, most of us didn’t really get anywhere on the first one. Although it was interesting that some people started with the board, one group started with the FateDecider class that decides whether things live or die. A sort of god class, but in a good way. They managed to implement all of the rules without even defining how the squares were arranged or anything like that. Another thing we tried was TDD (test-driven development). I’m sure most of you know what TDD is: Watch a test, watch it fail for the right reason Write code to pass the test, watch it pass Refactor, check the test still passes Repeat! It basically worked, we were able to produce code, but we often found the tests defined the direction that code went, which is obviously the idea of TDD. But you tend to find that by the time you’ve even written your first assertion, which is supposed to be the very first thing you write, because you write your tests backwards from the assertions back to the initial conditions, you’ve already constrained the logic of the code in some way by the time you’ve done that. You then get to this situation of, “Well, we actually want to go in a slightly different direction. Can we do this?”. Can we write tests that don’t constrain the architecture? Wrapping up all primitives: it’s kind of turtles all the way down. We had a Size, which has a Width and Height, which both derive from Dimension. You’ve got pages of code before you’ve even done anything. No getters and setters (use tell don’t ask instead): mocks and stubs for tests are required if you want to assert that your results are what you think they should be. You can’t just check the internal state of the code. And people found that really challenging and it made them think in a different way which I think is really good. Not having mutable state: that was kind of confusing because we weren’t quite sure what fitted within that rule and what didn’t, and I think we were trying too hard to follow the rule rather than the guideline. No if-statements: supposed to use polymorphism instead, but polymorphism still requires a factory with conditional behaviour. We did something really crazy to get around this: public T If(bool condition, Func<T> left, Func<T> right) { var dict = new Dictionary<bool, Func<T>> {{true, left}, {false, right}}; return dict[condition].Invoke(); } That is not really polymorphism, is it? For-loops: you can always replace a for-loop with recursion, but it doesn’t tend to make it any more readable unless it’s the kind of task that really lends itself to that. So it was interesting, it was good practice, but it wouldn’t make it easier it’s the kind of tree-structure algorithm where that would help. Having a limit on the number of levels of indentation: again, I think it does produce very nice, clean code, but it wasn’t actually a challenge because you just extract methods. That’s quite a useful thing because you can apply that to real code and say, “Okay, should this method really be going crazy like this?” No talking: we hated that. It’s like there’s two of you at a computer, and one of you is doing the typing, what does the other guy do if they’re not allowed to talk. The answer is TDD ping-pong – one person writes the tests, and then the other person writes the code to pass the test. And that creates communication without actually having to have discussion about things which is kind of cool. No code comments: just makes no difference to anything. It’s a forty-five minute exercise, so what are you going to put comments in code for? Finally, this is my fault. I discovered an entertaining way of doing the calculation that was kind of cool (using convolutions over the state of the board). Unfortunately, it turns out to be really hard to implement in C#, so didn’t even manage to work out how to do that convolution in C#. It’s trivial in some high-level languages, but you need something matrix-orientated for it to really work. That’s most of it, really. The thoughts that people went away with: we put down our answers to questions like “What have you learnt?” and “What surprised you?”, “How are you going to do things differently?”, and most people said redoing the problem is really, really good for understanding it properly. People hate having a massive legacy codebase that they can’t change, so being able to attack something three different ways in an environment where the end-product isn’t important: that’s something people really enjoyed. Pair-programming: also people said that they wanted to do more of that, especially with TDD ping-pong, where you write the test and somebody else writes the code. Various people thought different things about immutables, but most people thought they were good, they promote functional programming. And TDD people found really hard. “Tell, don’t ask” people found really, really hard and really, really, really hard to do well. And the recursion just made things trickier to debug. But most people agreed that coderetreats are really cool, and we should do more of them.

Read the article
How to make software development decisions based on facts

- by Laila

We love to hear stories about the many and varied ways our customers use the tools that we develop, but in our earnest search for stories and feedback, we'd rather forgotten that some of our keenest users are fellow RedGaters, in the same building. It was almost by chance that we discovered how the SQL Source Control team were using SmartAssembly. As it happens, there is a separate account (here on Simple-Talk) of how SmartAssembly was used to support the Early Access program; by providing answers to specific questions about how the SQL Source Control product was used. But what really got us all grinning was how valuable the SQL Source Control team found the reports that SmartAssembly was quickly and painlessly providing. So gather round, my friends, and I'll tell you the Tale Of The Framework Upgrade . <strange mirage effect to denote a flashback. A subtle background string of music starts playing in minor key> Kevin and his team were undecided. They weren't sure whether they could move their software product from .NET 2 to .NET 3.5 , let alone to .NET 4. You see, they were faced with having to guess what version of .NET was already installed on the average user's machine, which I'm sure you'll agree is no easy task. Upgrading their code to .NET 3.5 might put a barrier to people trying the tool, which was the last thing Kevin wanted: "what if our users have to download X, Y, and Z before being able to open the application?" he asked. That fear of users having to do half an hour of downloads (.followed by at least ten minutes of installation. followed by a five minute restart) meant that Kevin's team couldn't take advantage of WCF (Windows Communication Foundation). This made them sad, because WCF would have allowed them to write their code in a much simpler way, and in hours instead of days (as was the case with .NET 2). Oh sure, they had a gut feeling that this probably wasn't the case, 3.5 had been out for so many years, but they weren't sure. <background music switches to major key> SmartAssembly Feature Usage Reporting gave Kevin and his team exactly what they needed: hard data on their users' systems, both hardware and software. I was there, I saw it happen, and that's not the sort of thing a woman quickly forgets. I'll always remember his last words (before he went to lunch): "You get lots of free information by just checking a box in SmartAssembly" is what he said. For example, they could see how many CPU cores their customers were using, and found out that they should be making use of parallelism to take advantage of available cores. But crucially, (and this is the moral of my tale, dear reader), Kevin saw that 99% of SQL Source Control's users were on .NET 3.5 or above. So he knew that they could make the switch and that is was safe to do so. With this reassurance, they could use WCF to not only make development easier, but to also give them a really nice way to do inter-process communication between the Source Control and the SQL Compare products. To have done that on .NET 2.0 was certainly possible <knowing chuckle>, but Microsoft have made it a lot easier with WCF. <strange mirage effect to denote end of flashback> So you see, with Feature Usage Reporting, they finally got the hard evidence they needed to safely make the switch to .NET 3.5, knowing it would not inconvenience their users. And that, my friends, is just the sort of thing we like to hear.

Read the article
Inside the Concurrent Collections: ConcurrentBag

- by Simon Cooper

Unlike the other concurrent collections, ConcurrentBag does not really have a non-concurrent analogy. As stated in the MSDN documentation, ConcurrentBag is optimised for the situation where the same thread is both producing and consuming items from the collection. We'll see how this is the case as we take a closer look. Again, I recommend you have ConcurrentBag open in a decompiler for reference. Thread Statics ConcurrentBag makes heavy use of thread statics - static variables marked with ThreadStaticAttribute. This is a special attribute that instructs the CLR to scope any values assigned to or read from the variable to the executing thread, not globally within the AppDomain. This means that if two different threads assign two different values to the same thread static variable, one value will not overwrite the other, and each thread will see the value they assigned to the variable, separately to any other thread. This is a very useful function that allows for ConcurrentBag's concurrency properties. You can think of a thread static variable: [ThreadStatic] private static int m_Value; as doing the same as: private static Dictionary<Thread, int> m_Values; where the executing thread's identity is used to automatically set and retrieve the corresponding value in the dictionary. In .NET 4, this usage of ThreadStaticAttribute is encapsulated in the ThreadLocal class. Lists of lists ConcurrentBag, at its core, operates as a linked list of linked lists: Each outer list node is an instance of ThreadLocalList, and each inner list node is an instance of Node. Each outer ThreadLocalList is owned by a particular thread, accessible through the thread local m_locals variable: private ThreadLocal<ThreadLocalList<T>> m_locals It is important to note that, although the m_locals variable is thread-local, that only applies to accesses through that variable. The objects referenced by the thread (each instance of the ThreadLocalList object) are normal heap objects that are not specific to any thread. Thinking back to the Dictionary analogy above, if each value stored in the dictionary could be accessed by other means, then any thread could access the value belonging to other threads using that mechanism. Only reads and writes to the variable defined as thread-local are re-routed by the CLR according to the executing thread's identity. So, although m_locals is defined as thread-local, the m_headList, m_nextList and m_tailList variables aren't. This means that any thread can access all the thread local lists in the collection by doing a linear search through the outer linked list defined by these variables. Adding items So, onto the collection operations. First, adding items. This one's pretty simple. If the current thread doesn't already own an instance of ThreadLocalList, then one is created (or, if there are lists owned by threads that have stopped, it takes control of one of those). Then the item is added to the head of that thread's list. That's it. Don't worry, it'll get more complicated when we account for the other operations on the list! Taking & Peeking items This is where it gets tricky. If the current thread's list has items in it, then it peeks or removes the head item (not the tail item) from the local list and returns that. However, if the local list is empty, it has to go and steal another item from another list, belonging to a different thread. It iterates through all the thread local lists in the collection using the m_headList and m_nextList variables until it finds one that has items in it, and it steals one item from that list. Up to this point, the two threads had been operating completely independently. To steal an item from another thread's list, the stealing thread has to do it in such a way as to not step on the owning thread's toes. Recall how adding and removing items both operate on the head of the thread's linked list? That gives us an easy way out - a thread trying to steal items from another thread can pop in round the back of another thread's list using the m_tail variable, and steal an item from the back without the owning thread knowing anything about it. The owning thread can carry on completely independently, unaware that one of its items has been nicked. However, this only works when there are at least 3 items in the list, as that guarantees there will be at least one node between the owning thread performing operations on the list head and the thread stealing items from the tail - there's no chance of the two threads operating on the same node at the same time and causing a race condition. If there's less than three items in the list, then there does need to be some synchronization between the two threads. In this case, the lock on the ThreadLocalList object is used to mediate access to a thread's list when there's the possibility of contention. Thread synchronization In ConcurrentBag, this is done using several mechanisms: Operations performed by the owner thread only take out the lock when there are less than three items in the collection. With three or greater items, there won't be any conflict with a stealing thread operating on the tail of the list. If a lock isn't taken out, the owning thread sets the list's m_currentOp variable to a non-zero value for the duration of the operation. This indicates to all other threads that there is a non-locked operation currently occuring on that list. The stealing thread always takes out the lock, to prevent two threads trying to steal from the same list at the same time. After taking out the lock, the stealing thread spinwaits until m_currentOp has been set to zero before actually performing the steal. This ensures there won't be a conflict with the owning thread when the number of items in the list is on the 2-3 item borderline. If any add or remove operations are started in the meantime, and the list is below 3 items, those operations try to take out the list's lock and are blocked until the stealing thread has finished. This allows a thread to steal an item from another thread's list without corrupting it. What about synchronization in the collection as a whole? Collection synchronization Any thread that operates on the collection's global structure (accessing anything outside the thread local lists) has to take out the collection's global lock - m_globalListsLock. This single lock is sufficient when adding a new thread local list, as the items inside each thread's list are unaffected. However, what about operations (such as Count or ToArray) that need to access every item in the collection? In order to ensure a consistent view, all operations on the collection are stopped while the count or ToArray is performed. This is done by freezing the bag at the start, performing the global operation, and unfreezing at the end: The global lock is taken out, to prevent structural alterations to the collection. m_needSync is set to true. This notifies all the threads that they need to take out their list's lock irregardless of what operation they're doing. All the list locks are taken out in order. This blocks all locking operations on the lists. The freezing thread waits for all current lockless operations to finish by spinwaiting on each m_currentOp field. The global operation can then be performed while the bag is frozen, but no other operations can take place at the same time, as all other threads are blocked on a list's lock. Then, once the global operation has finished, the locks are released, m_needSync is unset, and normal concurrent operation resumes. Concurrent principles That's the essence of how ConcurrentBag operates. Each thread operates independently on its own local list, except when they have to steal items from another list. When stealing, only the stealing thread is forced to take out the lock; the owning thread only has to when there is the possibility of contention. And a global lock controls accesses to the structure of the collection outside the thread lists. Operations affecting the entire collection take out all locks in the collection to freeze the contents at a single point in time. So, what principles can we extract here? Threads operate independently Thread-static variables and ThreadLocal makes this easy. Threads operate entirely concurrently on their own structures; only when they need to grab data from another thread is there any thread contention. Minimised lock-taking Even when two threads need to operate on the same data structures (one thread stealing from another), they do so in such a way such that the probability of actually blocking on a lock is minimised; the owning thread always operates on the head of the list, and the stealing thread always operates on the tail. Management of lockless operations Any operations that don't take out a lock still have a 'hook' to force them to lock when necessary. This allows all operations on the collection to be stopped temporarily while a global snapshot is taken. Hopefully, such operations will be short-lived and infrequent. That's all the concurrent collections covered. I hope you've found it as informative and interesting as I have. Next, I'll be taking a closer look at ThreadLocal, which I came across while analyzing ConcurrentBag. As you'll see, the operation of this class deserves a much closer look.

Read the article
What’s the use of code reuse?

- by Tony Davis

All great developers write reusable code, don’t they? Well, maybe, but as with all statements regarding what “great” developers do or don’t do, it’s probably an over-simplification. A novice programmer, in particular, will encounter in the literature a general assumption of the importance of code reusability. They spend time worrying about DRY (don’t repeat yourself), moving logic into specific “helper” modules that they can then reuse, agonizing about the minutiae of the class structure, inheritance and interface design that will promote easy reuse. Unfortunately, writing code specifically for reuse often leads to complicated object hierarchies and inheritance models that are anything but reusable. If, instead, one strives to write simple code units that are highly maintainable and perform a single function, in a concise, isolated fashion then the potential for reuse simply “drops out” as a natural by-product. Programmers, of course, care about these principles, about encapsulation and clean interfaces that don’t expose inner workings and allow easy pluggability. This is great when it helps with the maintenance and development of code but how often, in practice, do we actually reuse our code? Most DBAs and database developers are familiar with the practical reasons for the limited opportunities to reuse database code and its potential downsides. However, surely elsewhere in our code base, reuse happens often. After all, we can all name examples, such as date/time handling modules, which if we write with enough care we can plug in to many places. I spoke to a developer just yesterday who looked me in the eye and told me that in 30+ years as a developer (a successful one, I’d add), he’d never once reused his own code. As I sat blinking in disbelief, he explained that, of course, he always thought he would reuse it. He’d often agonized over its design, certain that he was creating code of great significance that he and other generations would reuse, with grateful tears misting their eyes. In fact, it never happened. He had in his head, most of the algorithms he needed and would simply write the code from scratch each time, refining the algorithms and tailoring the code to meet the specific requirements. It was, he said, simply quicker to do that than dig out the old code, check it, correct the mistakes, and adapt it. Is this a common experience, or just a strange anomaly? Viewed in a certain light, building code with a focus on reusability seems to hark to a past age where people built cars and music systems with the idea that someone else could and would replace and reuse the parts. Technology advances so rapidly that the next time you need the “same” code, it’s likely a new technique, or a whole new language, has emerged in the meantime, better equipped to tackle the task. Maybe we should be less fearful of the idea that we could write code well suited to the system requirements, but with little regard for reuse potential, and then rewrite a better version from scratch the next time.

Read the article
A temporary disagreement

- by Tony Davis

Last month, Phil Factor caused a furore amongst some MVPs with an article that attempted to offer simple advice to developers regarding the use of table variables, versus local and global temporary tables, in their code. Phil makes clear that the table variables do come with some fairly major limitations.no distribution statistics, no parallel query plans for queries that modify table variables.but goes on to suggest that for reasonably small-scale strategic uses, and with a bit of due care and testing, table variables are a "good thing". Not everyone shares his opinion; in fact, I imagine he was rather aghast to learn that there were those felt his article was akin to pulling the pin out of a grenade and tossing it into the database; table variables should be avoided in almost all cases, according to their advice, in favour of temp tables. In other words, a fairly major feature of SQL Server should be more-or-less 'off limits' to developers. The problem with temp tables is that, because they are scoped either in the procedure or the connection, it is easy to allow them to hang around for too long, eating up precious memory and bulking up the shared tempdb database. Unless they are explicitly dropped, global temporary tables, and local temporary tables created within a connection rather than within a stored procedure, will persist until the connection is closed or, with connection pooling, until the connection is reused. It's also quite common with ASP.NET applications to have connection leaks, as Bill Vaughn explains in his chapter in the "SQL Server Deep Dives" book, meaning that the web page exits without closing the connection object, maybe due to an error condition. This will then hang around in the heap for what might be hours before picked up by the garbage collector. Table variables are much safer in this regard, since they are batch-scoped and so are cleaned up automatically once the batch is complete, which also means that they are intuitive to use for the developer because they conform to scoping rules that are closer to those in procedural code. On the surface then, an ideal way to deal with issues related to tempdb memory hogging. So why did Phil qualify his recommendation to use Table Variables? This is another of those cases where, like scalar UDFs and table-valued multi-statement UDFs, developers can sometimes get into trouble with a relatively benign-looking feature, due to way it's been implemented in SQL Server. Once again the biggest problem is how they are handled internally, by the SQL Server query optimizer, which can make very poor choices for JOIN orders and so on, in the absence of statistics, especially when joining to tables with highly-skewed data. The resulting execution plans can be horrible, as will be the resulting performance. If the JOIN is to a large table, that will hurt. Ideally, Microsoft would simply fix this issue so that developers can't get burned in this way; they've been around since SQL Server 2000, so Microsoft has had a bit of time to get it right. As I commented in regard to UDFs, when developers discover issues like with such standard features, the database becomes an alien planet to them, where death lurks around each corner, and they continue to avoid these "killer" features years after the problems have been eventually resolved. In the meantime, what is the right approach? Is it to say "hammers can kill, don't ever use hammers", or is it to try to explain, as Phil's article and follow-up blog post have tried to do, what the feature was intended for, why care must be applied in its use, and so enable developers to make properly-informed decisions, without requiring them to delve deep into the inner workings of SQL Server? Cheers, Tony.

Read the article
Anatomy of a serialization killer

- by Brian Donahue

As I had mentioned last month, I have been working on a project to create an easy-to-use managed debugger. It's still an internal tool that we use at Red Gate as part of product support to analyze application errors on customer's computers, and as such, should be easy to use and not require installation. Since the project has got rather large and important, I had decided to use SmartAssembly to protect all of my hard work. This was trivial for the most part, but the loading and saving of results was broken by SA after using the obfuscation, rendering the loading and saving of XML results basically useless, although the merging and error reporting was an absolute godsend and definitely worth the price of admission. (Well, I get my Red Gate licenses for free, but you know what I mean!)My initial reaction was to simply exclude the serializable results class and all of its' members from obfuscation, and that was just dandy, but a few weeks on I decided to look into exactly why serialization had broken and change the code to work with SA so I could write any new code to be compatible with SmartAssembly and save me some additional testing and changes to the SA project.In simple terms, SA does all that it can to prevent serialization problems, for instance, it will not obfuscate public members of a DLL and it will exclude any types with the Serializable attribute from obfuscation. This prevents public members and properties from being made private and having the name changed. If the serialization is done inside the executable, however, public members have the access changed to private and are renamed. That was my first problem, because my types were in the executable assembly and implemented ISerializable, but did not have the Serializable attribute set on them!public class RedFlagResults : ISerializable { }The second problem caused by the pruning feature. Although RedFlagResults had public members, they were not truly properties, and used the GetObjectData() method of ISerializable to serialize the members. For that reason, SA could not exclude these members from pruning and further broke the serialization. public class RedFlagResults : ISerializable { public List<RedFlag.Exception> Exceptions; #region ISerializable Members public void GetObjectData(SerializationInfo info, StreamingContext context) { info.AddValue("Exceptions", Exceptions); } #endregionSo to fix this, it was necessary to make Exceptions a proper property by implementing get and set on it. Also, I added the Serializable attribute so that I don't have to exclude the class from obfuscation in the SA project any more. The DoNotPrune attribute means I do not need to exclude the class from pruning.[Serializable, SmartAssembly.Attributes.DoNotPrune] public class RedFlagResults { public List<RedFlag.Exception> Exceptions {get;set;} }Similarly, the Exception class gets the Serializable and DoNotPrune attributes applied so all of its' properties are excluded from obfuscation.Now my project has some protection from prying eyes by scrambling up the code so it's harder to reverse-engineer, without breaking anything. SmartAssembly has also provided the benefit of merging so that the end-user doesn't need to extract all of the DLL files needed by RedFlag into a directory, and can be run directly from the .zip archive. When an error occurs (hey, I'm only human!), an exception report can be sent to me so I can see what went wrong without having to, er, debug the debugger.

Read the article
SQL Server source control from Visual Studio

- by David Atkinson

Developers have long since had to context switch between two IDEs, Visual Studio for application code development and SQL Server Management Studio for database development. While this is accepted, especially given the richness of the database development feature set in SSMS, loading a separate tool can seem a little overkill. This is where SQL Connect comes in. This is an add-in to Visual Studio that provides a connected development experience for the SQL Server developer. Connected database development involves modifying a development sandbox database, as opposed to offline development, where SQL text files are modified independently of the database. One of the main complaints of Data Dude (VS DBPro) is that it enforces the offline approach. This gripe is what SQL Connect addresses. If you don't already use SQL Source Control, you can get up and running with SQL Connect by adding a new project to your Visual Studio solution as follows: Then choose your existing development database and you're ready to go. If you already use SQL Source Control, you will need to link SQL Connect to your existing database scripts folder repository, so SQL Connect and SQL Source Control can be used collaboratively (note that SQL Source Control v.3.0.9.18 or later is required). Locate the repository (this can be found in the Setup tab in SQL Source Control). .and create a working folder for it (here I'm using TortoiseSVN). Back in Visual Studio, locate the SQL Connect panel (in the View menu if it hasn't auto loaded) and select Import SQL Source Control project Locate your working folder and click Import. This creates a Red Gate database project under your solution: From here you can modify your development database, and manage your changes in source control. To associate your development database with the project, right click on the project node, select Properties, set the database and Save. Now you're ready to make some changes. Locate the object you'd like to modify in the Solution Explorer, and double click it to invoke a query window or table designer. You also have the option to edit the creation SQL directly using Edit SQL File in Project. Keeping the development database and Visual Studio project in sync is as easy as clicking on a button. One you've made your change, you can use whichever mechanism you choose to commit to source control. Here I'm using the free open-source AnkhSVN to integrate Subversion with Visual Studio. Maintaining your database in a Visual Studio solution means that you can commit database changes and application code changes in the same changeset. This is desirable if you have continuous integration set up as you want to ensure that all files related to a change are committed atomically, so you avoid an interim "broken build". More discussion on SQL Connect and its benefits can be found in the following article on Simple Talk: No More Disconnected SQL Development in Visual Studio The SQL Connect project team is currently assessing the backlog for the next development effort, and they'd appreciate your feature suggestions, as well as your votes on their suggestions site: http://redgate.uservoice.com/forums/140800-sql-connect-for-visual-studio- A 28-day free trial of SQL Connect is available from the Red Gate website. Technorati Tags: SQL Server

Read the article
Inappropriate Updates?

- by Tony Davis

A recent Simple-talk article by Kathi Kellenberger dissected the fastest SQL solution, submitted by Peter Larsson as part of Phil Factor's SQL Speed Phreak challenge, to the classic "running total" problem. In its analysis of the code, the article re-ignited a heated debate regarding the techniques that should, and should not, be deemed acceptable in your search for fast SQL code. Peter's code for running total calculation uses a variation of a somewhat contentious technique, sometimes referred to as a "quirky update": SET @Subscribers = Subscribers = @Subscribers + PeopleJoined - PeopleLeft This form of the UPDATE statement, @variable = column = expression, is documented and it allows you to set a variable to the value returned by the expression. Microsoft does not guarantee the order in which rows are updated in this technique because, in relational theory, a table doesn’t have a natural order to its rows and the UPDATE statement has no means of specifying the order. Traditionally, in cases where a specific order is requires, such as for running aggregate calculations, programmers who used the technique have relied on the fact that the UPDATE statement, without the WHERE clause, is executed in the order imposed by the clustered index, or in heap order, if there isn’t one. Peter wasn’t satisfied with this, and so used the ingenious device of assuring the order of the UPDATE by the use of an "ordered CTE", based on an underlying temporary staging table (a heap). However, in either case, the ordering is still not guaranteed and, in addition, would be broken under conditions of parallelism, or partitioning. Many argue, with validity, that this reliance on a given order where none can ever be guaranteed is an abuse of basic relational principles, and so is a bad practice; perhaps even irresponsible. More importantly, Microsoft doesn't wish to support the technique and offers no guarantee that it will always work. If you put it into production and it breaks in a later version, you can't file a bug. As such, many believe that the technique should never be tolerated in a production system, under any circumstances. Is this attitude justified? After all, both forms of the technique, using a clustered index to guarantee the order or using an ordered CTE, have been tested rigorously and are proven to be robust; although not guaranteed by Microsoft, the ordering is reliable, provided none of the conditions that are known to break it are violated. In Peter's particular case, the technique is being applied to a temporary table, where the developer has full control of the data ordering, and indexing, and knows that the table will never be subject to parallelism or partitioning. It might be argued that, in such circumstances, the technique is not really "quirky" at all and to ban it from your systems would server no real purpose other than to deprive yourself of a reliable technique that has uses that extend well beyond the running total calculations. Of course, it is doubly important that such a technique, including its unsupported status and the assumptions that underpin its success, is fully and clearly documented, preferably even when posting it online in a competition or forum post. Ultimately, however, this technique has been available to programmers throughout the time Sybase and SQL Server has existed, and so cannot be lightly cast aside, even if one sympathises with Microsoft for the awkwardness of maintaining an archaic way of doing updates. After all, a Table hint could easily be devised that, if specified in the WITH (<Table_Hint_Limited>) clause, could be used to request the database engine to do the update in the conventional order. Then perhaps everyone would be satisfied. Cheers, Tony.

Read the article
F1 Pit Pragmatics

- by mikef

"I hate computers. No, really, I hate them. I love the communications they facilitate, I love the conveniences they provide to my life. but I actually hate the computers themselves." - Scott Merrill, 'I hate computers: confessions of a Sysadmin' If Scott's goal was to polarize opinion and trigger raging arguments over the 'real reasons why computers suck', then he certainly succeeded. Impassioned vitriol sits side-by-side with rational debate. Yet Scott's fundamental point is absolutely on the money - Computers are a means to an end. The IT industry is finally starting to put weight behind the notion that good User Experience is an absolutely crucial goal, a cause championed by the likes of Microsoft's Bill Buxton, and which Apple's increasingly ubiquitous touch screen interface exemplifies. However, that doesn't change the fact that, occasionally, you just have to man up and deal with complex systems. In fact, sometimes you just need to sacrifice everything else in the name of performance. You'll find a perfect example of this Faustian bargain in Trevor Clarke's fascinating look into the (diabolical) IT infrastructure of modern F1 racing - high performance, high availability. high everything. To paraphrase, each car has up to 100 sensors, transmitting around 30Gb of data over the course of a race (70% in real-time). This data is then processed by no less than 3 servers (per car) so that the engineers in the pit have access to telemetry, strategy information, timing feeds, a connection back to the operations room in the team's home base - the list goes on. All of this while the servers are exposed "to carbon dust, oil, vibration, rain, heat, [and] variable power". Now, this is admittedly an extreme context where there's no real choice but to use complex systems where ease-of-use is, at best, a secondary concern. The flip-side is seen in small-scale personal computing such as that seen in Apple's iDevices, which are incredibly intuitive but limited in their scope. In terms of what kinds of systems they prefer to use, I suspect that most SysAdmins find themselves somewhere along this axis of Power vs. Usability, and which end of this axis you resonate with also hints at where you think the IT industry should focus its energy. Do you see yourself in the F1 pit, making split-second decisions, wrestling with information flows and reticent hardware to bend them to your will? If so, I imagine you feel that computers are subtle tools which need to be tuned and honed, using the advanced knowledge possessed only by responsible SysAdmins (If you have an iPhone, I suspect it's jail-broken). If the machines throw enigmatic errors, it's the price of flexibility and raw power. Alternatively, would you prefer to have your role more accessible, with users empowered by knowledge, spreading the load of managing IT environments? In that case, then you want hardware and software to have User Experience as their primary focus, and are of the "means to an end" school of thought (you're probably also fed up with users not listening to you when you try and help). At its heart, the dichotomy is between raw power (which might be difficult to use) and ease-of-use (which might have some limitations, but you can be up and running immediately). Of course, the ultimate goal is a fusion of flexibility, power and usability all in one system. It's achievable in specific software environments, and Red Gate considers it a target worth aiming for, but in other cases it's a goal right up there with cold fusion. I think it'll be a long time before we see it become ubiquitous. In the meantime, are you Power-Hungry or a Champion of Usability? Cheers, Michael Francis Simple Talk SysAdmin Editor

Read the article
Using Live Data in Database Development Work

- by Phil Factor

Guest Editorial for Simple-Talk Newsletter... in which Phil Factor reacts with some exasperation when coming across a report that a majority of companies were still using financial and personal data for both developing and testing database applications. If you routinely test your development work using real production data that contains personal or financial information, you are probably being irresponsible, and at worst, risking a heavy financial penalty for your company. Surprisingly, over 80% of financial companies still do this. Plenty of data breaches and fraud have happened from the use of real data for testing, and a data breach is a nightmare for any organisation that suffers one. The cost of each data breach averages out at around $7.2 million in the US in notification, escalation, credit monitoring, fines, litigation, legal costs, and lost business due to customer churn, £1.9 million in the UK. 70% of data breaches are done from within the organisation. Real data can be exploited in a number of ways for malicious or criminal purposes. It isn't just the obvious use of items such as name and address, date of birth, social security number, and credit card and bank account numbers: Data can be exploited in many subtle ways, so there are excellent reasons to ensure that a high priority is given to the detection and prevention of any data breaches. You'll never successfully guess all the ways that real data can be exploited maliciously, or the ease with which it can be accessed. It would be silly to argue that developers never need access to a copy of the database containing live data. Developers sometimes need to track a bug that can only be replicated on the data from the live database. However, it has to be done in a very restrictive harness. The law makes no distinction between development and production databases when a data breach occurs, so the data has to be held with all appropriate security measures in place. In Europe, the use of personal data for testing requires the explicit consent of the people whose data is being held. There are federal standards such as GLBA, PCI DSS and HIPAA, and most US States have privacy legislation. The task of ensuring compliance and tight security in such circumstances is an expensive and time-consuming overhead. The developer is likely to suffer investigation if a data breach occurs, even if the company manages to stay in business. Ironically, the use of copies of live data isn't usually the most effective way to develop or test your data. Data is usually time-specific and isn't usually current by the time it is used for testing, Existing data doesn't help much for new functionality, and every time the data is refreshed from production, any test data is likely to be overwritten. Also, it is not always going to test all the 'edge' conditions that are likely to flush out bugs. You still have the task of simulating the dynamics of actual usage of the database, and here you have no alternative to creating 'spoofed' data. Because of the complexities of relational data, It used to be that there was no realistic alternative to developing and testing with live data. However, this is no longer the case. Real data can be obfuscated, or it can be created entirely from scratch. The latter process used to be impractical, now that there are plenty of third-party tools to choose from. The process of obfuscation isn't risk free. The process must access the live data, and the success of the obfuscation process has to be carefully monitored. Database data security isn't an exciting topic to you or I, but to a hacker it can be an all-consuming obsession, especially if there is financial or political gain involved. This is not the sort of adversary one would wish for and it is far better to accept, and work with, security restrictions that exist for using live data in database development work, especially when the tools exist to create large realistic database test data that can be better for several aspects of testing.

Read the article

Search Results

Search found 47615 results on 1905 pages for 'make it useful keep it simple'.

Page 161/1905 | < Previous Page | 157 158 159 160 161 162 163 164 165 166 167 168 | Next Page >

- by Tony Davis

- by Simon Cooper

- by Chris George

- by laerte

- by Johnm

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Jason Crease

- by Simon Cooper

- by Phil Factor

- by Red Gate Software BI Tools Team

- by mikef

- by Tom Ritter

- by Mark Virtue

- by Michael Williamson

- by Laila

- by Simon Cooper

- by Tony Davis

- by Tony Davis

- by Brian Donahue

- by David Atkinson

- by Tony Davis

- by mikef

- by Phil Factor

< Previous Page | 157 158 159 160 161 162 163 164 165 166 167 168 | Next Page >