Search Results

Search found 933 results on 38 pages for 'simon cooper'.

Page 4/38 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Subterranean IL: Compiling C# exception handlers

- by Simon Cooper

An exception handler in C# combines the IL catch and finally exception handling clauses into a single try statement: try { Console.WriteLine("Try block") // ... } catch (IOException) { Console.WriteLine("IOException catch") // ... } catch (Exception e) { Console.WriteLine("Exception catch") // ... } finally { Console.WriteLine("Finally block") // ... } How does this get compiled into IL? Initial implementation If you remember from my earlier post, finally clauses must be specified with their own .try clause. So, for the initial implementation, we take the try/catch/finally, and simply split it up into two .try clauses (I have to use label syntax for this): StartTry: ldstr "Try block" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End EndTry: StartIOECatch: ldstr "IOException catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End EndIOECatch: StartECatch: ldstr "Exception catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End EndECatch: StartFinally: ldstr "Finally block" call void [mscorlib]System.Console::WriteLine(string) // ... endfinally EndFinally: End: // ... .try StartTry to EndTry catch [mscorlib]System.IO.IOException handler StartIOECatch to EndIOECatch catch [mscorlib]System.Exception handler StartECatch to EndECatch .try StartTry to EndTry finally handler StartFinally to EndFinally However, the resulting program isn't verifiable, and doesn't run: [IL]: Error: Shared try has finally or fault handler. Nested try blocks What's with the verification error? Well, it's a condition of IL verification that all exception handling regions (try, catch, filter, finally, fault) of a single .try clause have to be completely contained within any outer exception region, and they can't overlap with any other exception handling clause. In other words, IL exception handling clauses must to be representable in the scoped syntax, and in this example, we're overlapping catch and finally clauses. Not only is this example not verifiable, it isn't semantically correct. The finally handler is specified round the .try. What happens if you were able to run this code, and an exception was thrown? Program execution enters top of try block, and exception is thrown within it CLR searches for an exception handler, finds catch Because control flow is leaving .try, finally block is run The catch block is run leave.s End inside the catch handler branches to End label. We're actually running the finally before the catch! What we do about it What we actually need to do is put the catch clauses inside the finally clause, as this will ensure the finally gets executed at the correct time (this time using scoped syntax): .try { .try { ldstr "Try block" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End } catch [mscorlib]System.IO.IOException { ldstr "IOException catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End } catch [mscorlib]System.Exception { ldstr "Exception catch" call void [mscorlib]System.Console::WriteLine(string) // ... leave.s End } } finally { ldstr "Finally block" call void [mscorlib]System.Console::WriteLine(string) // ... endfinally } End: ret Returning from methods There is a further semantic mismatch that the C# compiler has to deal with; in C#, you are allowed to return from within an exception handling block: public int HandleMethod() { try { // ... return 0; } catch (Exception) { // ... return -1; } } However, you can't ret inside an exception handling block in IL. So the C# compiler does a leave.s to a ret outside the exception handling area, loading/storing any return value to a local variable along the way (as leave.s clears the stack): .method public instance int32 HandleMethod() { .locals init ( int32 retVal ) .try { // ... ldc.i4.0 stloc.0 leave.s End } catch [mscorlib]System.Exception { // ... ldc.i4.m1 stloc.0 leave.s End } End: ldloc.0 ret } Conclusion As you can see, the C# compiler has quite a few hoops to jump through to translate C# code into semantically-correct IL, and hides the numerous conditions on IL exception handling blocks from the C# programmer. Next up: catch-all blocks, and how the runtime deals with non-Exception exceptions.

Read the article
.NET vs Windows 8: Rematch!

- by Simon Cooper

So, although you will be able to use your existing .NET skills to develop Metro apps, it turns out Microsoft are limiting Visual Studio 2011 Express to Metro-only. From the Express website: Visual Studio 11 Express for Windows 8 provides tools for Metro style app development. To create desktop apps, you need to use Visual Studio 11 Professional, or higher. Oh dear. To develop any sort of non-Metro application, you will need to pay for at least VS Professional. I suspect Microsoft (or at least, certain groups within Microsoft) have a very explicit strategy in mind. By making VS Express Metro-only, developers who don't want to pay for Professional will be forced to make their simple one-shot or open-source application in Metro. This increases the number of applications available for Windows 8 and Windows mobile devices, which in turn make those platforms more attractive for consumers. When you use the free VS 11 Express, instead of paying Microsoft, you provide them a service by making applications for Metro, which in turn makes Microsoft's mobile offering more attractive to consumers, increasing their market share. Of course, it remains to be seen if developers forced to jump onto the Metro bandwagon will simply jump ship to Android or iOS instead. At least, that's what I think is going on. With Microsoft, who really knows?

Read the article
Some non-generic collections

- by Simon Cooper

Although the collections classes introduced in .NET 2, 3.5 and 4 cover most scenarios, there are still some .NET 1 collections that don't have generic counterparts. In this post, I'll be examining what they do, why you might use them, and some things you'll need to bear in mind when doing so. BitArray System.Collections.BitArray is conceptually the same as a List<bool>, but whereas List<bool> stores each boolean in a single byte (as that's what the backing bool[] does), BitArray uses a single bit to store each value, and uses various bitmasks to access each bit individually. This means that BitArray is eight times smaller than a List<bool>. Furthermore, BitArray has some useful functions for bitmasks, like And, Xor and Not, and it's not limited to 32 or 64 bits; a BitArray can hold as many bits as you need. However, it's not all roses and kittens. There are some fundamental limitations you have to bear in mind when using BitArray: It's a non-generic collection. The enumerator returns object (a boxed boolean), rather than an unboxed bool. This means that if you do this: foreach (bool b in bitArray) { ... } Every single boolean value will be boxed, then unboxed. And if you do this: foreach (var b in bitArray) { ... } you'll have to manually unbox b on every iteration, as it'll come out of the enumerator an object. Instead, you should manually iterate over the collection using a for loop: for (int i=0; i<bitArray.Length; i++) { bool b = bitArray[i]; ... } Following on from that, if you want to use BitArray in the context of an IEnumerable<bool>, ICollection<bool> or IList<bool>, you'll need to write a wrapper class, or use the Enumerable.Cast<bool> extension method (although Cast would box and unbox every value you get out of it). There is no Add or Remove method. You specify the number of bits you need in the constructor, and that's what you get. You can change the length yourself using the Length property setter though. It doesn't implement IList. Although not really important if you're writing a generic wrapper around it, it is something to bear in mind if you're using it with pre-generic code. However, if you use BitArray carefully, it can provide significant gains over a List<bool> for functionality and efficiency of space. OrderedDictionary System.Collections.Specialized.OrderedDictionary does exactly what you would expect - it's an IDictionary that maintains items in the order they are added. It does this by storing key/value pairs in a Hashtable (to get O(1) key lookup) and an ArrayList (to maintain the order). You can access values by key or index, and insert or remove items at a particular index. The enumerator returns items in index order. However, the Keys and Values properties return ICollection, not IList, as you might expect; CopyTo doesn't maintain the same ordering, as it copies from the backing Hashtable, not ArrayList; and any operations that insert or remove items from the middle of the collection are O(n), just like a normal list. In short; don't use this class. If you need some sort of ordered dictionary, it would be better to write your own generic dictionary combining a Dictionary<TKey, TValue> and List<KeyValuePair<TKey, TValue>> or List<TKey> for your specific situation. ListDictionary and HybridDictionary To look at why you might want to use ListDictionary or HybridDictionary, we need to examine the performance of these dictionaries compared to Hashtable and Dictionary<object, object>. For this test, I added n items to each collection, then randomly accessed n/2 items: So, what's going on here? Well, ListDictionary is implemented as a linked list of key/value pairs; all operations on the dictionary require an O(n) search through the list. However, for small n, the constant factor that big-o notation doesn't measure is much lower than the hashing overhead of Hashtable or Dictionary. HybridDictionary combines a Hashtable and ListDictionary; for small n, it uses a backing ListDictionary, but switches to a Hashtable when it gets to 9 items (you can see the point it switches from a ListDictionary to Hashtable in the graph). Apart from that, it's got very similar performance to Hashtable. So why would you want to use either of these? In short, you wouldn't. Any gain in performance by using ListDictionary over Dictionary<TKey, TValue> would be offset by the generic dictionary not having to cast or box the items you store, something the graphs above don't measure. Only if the performance of the dictionary is vital, the dictionary will hold less than 30 items, and you don't need type safety, would you use ListDictionary over the generic Dictionary. And even then, there's probably more useful performance gains you can make elsewhere.

Read the article
Anatomy of a .NET Assembly - CLR metadata 2

- by Simon Cooper

Before we look any further at the CLR metadata, we need a quick diversion to understand how the metadata is actually stored. Encoding table information As an example, we'll have a look at a row in the TypeDef table. According to the spec, each TypeDef consists of the following: Flags specifying various properties of the class, including visibility. The name of the type. The namespace of the type. What type this type extends. The field list of this type. The method list of this type. How is all this data actually represented? Offset & RID encoding Most assemblies don't need to use a 4 byte value to specify heap offsets and RIDs everywhere, however we can't hard-code every offset and RID to be 2 bytes long as there could conceivably be more than 65535 items in a heap or more than 65535 fields or types defined in an assembly. So heap offsets and RIDs are only represented in the full 4 bytes if it is required; in the header information at the top of the #~ stream are 3 bits indicating if the #Strings, #GUID, or #Blob heaps use 2 or 4 bytes (the #US stream is not accessed from metadata), and the rowcount of each table. If the rowcount for a particular table is greater than 65535 then all RIDs referencing that table throughout the metadata use 4 bytes, else only 2 bytes are used. Coded tokens Not every field in a table row references a single predefined table. For example, in the TypeDef extends field, a type can extend another TypeDef (a type in the same assembly), a TypeRef (a type in a different assembly), or a TypeSpec (an instantiation of a generic type). A token would have to be used to let us specify the table along with the RID. Tokens are always 4 bytes long; again, this is rather wasteful of space. Cutting the RID down to 2 bytes would make each token 3 bytes long, which isn't really an optimum size for computers to read from memory or disk. However, every use of a token in the metadata tables can only point to a limited subset of the metadata tables. For the extends field, we only need to be able to specify one of 3 tables, which we can do using 2 bits: 0x0: TypeDef 0x1: TypeRef 0x2: TypeSpec We could therefore compress the 4-byte token that would otherwise be needed into a coded token of type TypeDefOrRef. For each type of coded token, the least significant bits encode the table the token points to, and the rest of the bits encode the RID within that table. We can work out whether each type of coded token needs 2 or 4 bytes to represent it by working out whether the maximum RID of every table that the coded token type can point to will fit in the space available. The space available for the RID depends on the type of coded token; a TypeOrMethodDef coded token only needs 1 bit to specify the table, leaving 15 bits available for the RID before a 4-byte representation is needed, whereas a HasCustomAttribute coded token can point to one of 18 different tables, and so needs 5 bits to specify the table, only leaving 11 bits for the RID before 4 bytes are needed to represent that coded token type. For example, a 2-byte TypeDefOrRef coded token with the value 0x0321 has the following bit pattern: 0 3 2 1 0000 0011 0010 0001 The first two bits specify the table - TypeRef; the other bits specify the RID. Because we've used the first two bits, we've got to shift everything along two bits: 000000 1100 1000 This gives us a RID of 0xc8. If any one of the TypeDef, TypeRef or TypeSpec tables had more than 16383 rows (2^14 - 1), then 4 bytes would need to be used to represent all TypeDefOrRef coded tokens throughout the metadata tables. Lists The third representation we need to consider is 1-to-many references; each TypeDef refers to a list of FieldDef and MethodDef belonging to that type. If we were to specify every FieldDef and MethodDef individually then each TypeDef would be very large and a variable size, which isn't ideal. There is a way of specifying a list of references without explicitly specifying every item; if we order the MethodDef and FieldDef tables by the owning type, then the field list and method list in a TypeDef only have to be a single RID pointing at the first FieldDef or MethodDef belonging to that type; the end of the list can be inferred by the field list and method list RIDs of the next row in the TypeDef table. Going back to the TypeDef If we have a look back at the definition of a TypeDef, we end up with the following reprensentation for each row: Flags - always 4 bytes Name - a #Strings heap offset. Namespace - a #Strings heap offset. Extends - a TypeDefOrRef coded token. FieldList - a single RID to the FieldDef table. MethodList - a single RID to the MethodDef table. So, depending on the number of entries in the heaps and tables within the assembly, the rows in the TypeDef table can be as small as 14 bytes, or as large as 24 bytes. Now we've had a look at how information is encoded within the metadata tables, in the next post we can see how they are arranged on disk.

Read the article
Why unhandled exceptions are useful

- by Simon Cooper

It’s the bane of most programmers’ lives – an unhandled exception causes your application or webapp to crash, an ugly dialog gets displayed to the user, and they come complaining to you. Then, somehow, you need to figure out what went wrong. Hopefully, you’ve got a log file, or some other way of reporting unhandled exceptions (obligatory employer plug: SmartAssembly reports an application’s unhandled exceptions straight to you, along with the entire state of the stack and variables at that point). If not, you have to try and replicate it yourself, or do some psychic debugging to try and figure out what’s wrong. However, it’s good that the program crashed. Or, more precisely, it is correct behaviour. An unhandled exception in your application means that, somewhere in your code, there is an assumption that you made that is actually invalid. Coding assumptions Let me explain a bit more. Every method, every line of code you write, depends on implicit assumptions that you have made. Take this following simple method, that copies a collection to an array and includes an item if it isn’t in the collection already, using a supplied IEqualityComparer: public static T[] ToArrayWithItem( ICollection<T> coll, T obj, IEqualityComparer<T> comparer) { // check if the object is in collection already // using the supplied comparer foreach (var item in coll) { if (comparer.Equals(item, obj)) { // it's in the collection already // simply copy the collection to an array // and return it T[] array = new T[coll.Count]; coll.CopyTo(array, 0); return array; } } // not in the collection // copy coll to an array, and add obj to it // then return it T[] array = new T[coll.Count+1]; coll.CopyTo(array, 0); array[array.Length-1] = obj; return array; } What’s all the assumptions made by this fairly simple bit of code? coll is never null comparer is never null coll.CopyTo(array, 0) will copy all the items in the collection into the array, in the order defined for the collection, starting at the first item in the array. The enumerator for coll returns all the items in the collection, in the order defined for the collection comparer.Equals returns true if the items are equal (for whatever definition of ‘equal’ the comparer uses), false otherwise comparer.Equals, coll.CopyTo, and the coll enumerator will never throw an exception or hang for any possible input and any possible values of T coll will have less than 4 billion items in it (this is a built-in limit of the CLR) array won’t be more than 2GB, both on 32 and 64-bit systems, for any possible values of T (again, a limit of the CLR) There are no threads that will modify coll while this method is running and, more esoterically: The C# compiler will compile this code to IL according to the C# specification The CLR and JIT compiler will produce machine code to execute the IL on the user’s computer The computer will execute the machine code correctly That’s a lot of assumptions. Now, it could be that all these assumptions are valid for the situations this method is called. But if this does crash out with an exception, or crash later on, then that shows one of the assumptions has been invalidated somehow. An unhandled exception shows that your code is running in a situation which you did not anticipate, and there is something about how your code runs that you do not understand. Debugging the problem is the process of learning more about the new situation and how your code interacts with it. When you understand the problem, the solution is (usually) obvious. The solution may be a one-line fix, the rewrite of a method or class, or a large-scale refactoring of the codebase, but whatever it is, the fix for the crash will incorporate the new information you’ve gained about your own code, along with the modified assumptions. When code is running with an assumption or invariant it depended on broken, then the result is ‘undefined behaviour’. Anything can happen, up to and including formatting the entire disk or making the user’s computer sentient and start doing a good impression of Skynet. You might think that those can’t happen, but at Halting problem levels of generality, as soon as an assumption the code depended on is broken, the program can do anything. That is why it’s important to fail-fast and stop the program as soon as an invariant is broken, to minimise the damage that is done. What does this mean in practice? To start with, document and check your assumptions. As with most things, there is a level of judgement required. How you check and document your assumptions depends on how the code is used (that’s some more assumptions you’ve made), how likely it is a method will be passed invalid arguments or called in an invalid state, how likely it is the assumptions will be broken, how expensive it is to check the assumptions, and how bad things are likely to get if the assumptions are broken. Now, some assumptions you can assume unless proven otherwise. You can safely assume the C# compiler, CLR, and computer all run the method correctly, unless you have evidence of a compiler, CLR or processor bug. You can also assume that interface implementations work the way you expect them to; implementing an interface is more than simply declaring methods with certain signatures in your type. The behaviour of those methods, and how they work, is part of the interface contract as well. For example, for members of a public API, it is very important to document your assumptions and check your state before running the bulk of the method, throwing ArgumentException, ArgumentNullException, InvalidOperationException, or another exception type as appropriate if the input or state is wrong. For internal and private methods, it is less important. If a private method expects collection items in a certain order, then you don’t necessarily need to explicitly check it in code, but you can add comments or documentation specifying what state you expect the collection to be in at a certain point. That way, anyone debugging your code can immediately see what’s wrong if this does ever become an issue. You can also use DEBUG preprocessor blocks and Debug.Assert to document and check your assumptions without incurring a performance hit in release builds. On my coding soapbox… A few pet peeves of mine around assumptions. Firstly, catch-all try blocks: try { ... } catch { } A catch-all hides exceptions generated by broken assumptions, and lets the program carry on in an unknown state. Later, an exception is likely to be generated due to further broken assumptions due to the unknown state, causing difficulties when debugging as the catch-all has hidden the original problem. It’s much better to let the program crash straight away, so you know where the problem is. You should only use a catch-all if you are sure that any exception generated in the try block is safe to ignore. That’s a pretty big ask! Secondly, using as when you should be casting. Doing this: (obj as IFoo).Method(); or this: IFoo foo = obj as IFoo; ... foo.Method(); when you should be doing this: ((IFoo)obj).Method(); or this: IFoo foo = (IFoo)obj; ... foo.Method(); There’s an assumption here that obj will always implement IFoo. If it doesn’t, then by using as instead of a cast you’ve turned an obvious InvalidCastException at the point of the cast that will probably tell you what type obj actually is, into a non-obvious NullReferenceException at some later point that gives you no information at all. If you believe obj is always an IFoo, then say so in code! Let it fail-fast if not, then it’s far easier to figure out what’s wrong. Thirdly, document your assumptions. If an algorithm depends on a non-trivial relationship between several objects or variables, then say so. A single-line comment will do. Don’t leave it up to whoever’s debugging your code after you to figure it out. Conclusion It’s better to crash out and fail-fast when an assumption is broken. If it doesn’t, then there’s likely to be further crashes along the way that hide the original problem. Or, even worse, your program will be running in an undefined state, where anything can happen. Unhandled exceptions aren’t good per-se, but they give you some very useful information about your code that you didn’t know before. And that can only be a good thing.

Read the article
Subterranean IL: Generics and array covariance

- by Simon Cooper

Arrays in .NET are curious beasts. They are the only built-in collection types in the CLR, and SZ-arrays (single dimension, zero-indexed) have their own commands and IL syntax. One of their stranger properties is they have a kind of built-in covariance long before generic variance was added in .NET 4. However, this causes a subtle but important problem with generics. First of all, we need to briefly recap on array covariance. SZ-array covariance To demonstrate, I'll tweak the classes I introduced in my previous posts: public class IncrementableClass { public int Value; public virtual void Increment(int incrementBy) { Value += incrementBy; } } public class IncrementableClassx2 : IncrementableClass { public override void Increment(int incrementBy) { base.Increment(incrementBy); base.Increment(incrementBy); } } In the CLR, SZ-arrays of reference types are implicitly convertible to arrays of the element's supertypes, all the way up to object (note that this does not apply to value types). That is, an instance of IncrementableClassx2[] can be used wherever a IncrementableClass[] or object[] is required. When an SZ-array could be used in this fashion, a run-time type check is performed when you try to insert an object into the array to make sure you're not trying to insert an instance of IncrementableClass into an IncrementableClassx2[]. This check means that the following code will compile fine but will fail at run-time: IncrementableClass[] array = new IncrementableClassx2[1]; array[0] = new IncrementableClass(); // throws ArrayTypeMismatchException These checks are enforced by the various stelem* and ldelem* il instructions in such a way as to ensure you can't insert a IncrementableClass into a IncrementableClassx2[]. For the rest of this post, however, I'm going to concentrate on the ldelema instruction. ldelema This instruction pops the array index (int32) and array reference (O) off the stack, and pushes a pointer (&) to the corresponding array element. However, unlike the ldelem instruction, the instruction's type argument must match the run-time array type exactly. This is because, once you've got a managed pointer, you can use that pointer to both load and store values in that array element using the ldind* and stind* (load/store indirect) instructions. As the same pointer can be used for both input and output to the array, the type argument to ldelema must be invariant. At the time, this was a perfectly reasonable restriction, and maintained array type-safety within managed code. However, along came generics, and with it the constrained callvirt instruction. So, what happens when we combine array covariance and constrained callvirt? .method public static void CallIncrementArrayValue() { // IncrementableClassx2[] arr = new IncrementableClassx2[1] ldc.i4.1 newarr IncrementableClassx2 // arr[0] = new IncrementableClassx2(); dup newobj instance void IncrementableClassx2::.ctor() ldc.i4.0 stelem.ref // IncrementArrayValue<IncrementableClass>(arr, 0) // here, we're treating an IncrementableClassx2[] as IncrementableClass[] dup ldc.i4.0 call void IncrementArrayValue<class IncrementableClass>(!!0[],int32) // ... ret } .method public static void IncrementArrayValue<(IncrementableClass) T>( !!T[] arr, int32 index) { // arr[index].Increment(1) ldarg.0 ldarg.1 ldelema !!T ldc.i4.1 constrained. !!T callvirt instance void IIncrementable::Increment(int32) ret } And the result: Unhandled Exception: System.ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array. at IncrementArrayValue[T](T[] arr, Int32 index) at CallIncrementArrayValue() Hmm. We're instantiating the generic method as IncrementArrayValue<IncrementableClass>, but passing in an IncrementableClassx2[], hence the ldelema instruction is failing as it's expecting an IncrementableClass[]. On features and feature conflicts What we've got here is a conflict between existing behaviour (ldelema ensuring type safety on covariant arrays) and new behaviour (managed pointers to object references used for every constrained callvirt on generic type instances). And, although this is an edge case, there is no general workaround. The generic method could be hidden behind several layers of assemblies, wrappers and interfaces that make it a requirement to use array covariance when calling the generic method. Furthermore, this will only fail at runtime, whereas compile-time safety is what generics were designed for! The solution is the readonly. prefix instruction. This modifies the ldelema instruction to ignore the exact type check for arrays of reference types, and so it lets us take the address of array elements using a covariant type to the actual run-time type of the array: .method public static void IncrementArrayValue<(IncrementableClass) T>( !!T[] arr, int32 index) { // arr[index].Increment(1) ldarg.0 ldarg.1 readonly. ldelema !!T ldc.i4.1 constrained. !!T callvirt instance void IIncrementable::Increment(int32) ret } But what about type safety? In return for ignoring the type check, the resulting controlled mutability pointer can only be used in the following situations: As the object parameter to ldfld, ldflda, stfld, call and constrained callvirt instructions As the pointer parameter to ldobj or ldind* As the source parameter to cpobj In other words, the only operations allowed are those that read from the pointer; stind* and similar that alter the pointer itself are banned. This ensures that the array element we're pointing to won't be changed to anything untoward, and so type safety within the array is maintained. This is a typical example of the maxim that whenever you add a feature to a program, you have to consider how that feature interacts with every single one of the existing features. Although an edge case, the readonly. prefix instruction ensures that generics and array covariance work together and that compile-time type safety is maintained. Tune in next time for a look at the .ctor generic type constraint, and what it means.

Read the article
PostSharp, Obfuscation, and IL

- by Simon Cooper

Aspect-oriented programming (AOP) is a relatively new programming paradigm. Originating at Xerox PARC in 1994, the paradigm was first made available for general-purpose development as an extension to Java in 2001. From there, it has quickly been adapted for use in all the common languages used today. In the .NET world, one of the primary AOP toolkits is PostSharp. Attributes and AOP Normally, attributes in .NET are entirely a metadata construct. Apart from a few special attributes in the .NET framework, they have no effect whatsoever on how a class or method executes within the CLR. Only by using reflection at runtime can you access any attributes declared on a type or type member. PostSharp changes this. By declaring a custom attribute that derives from PostSharp.Aspects.Aspect, applying it to types and type members, and running the resulting assembly through the PostSharp postprocessor, you can essentially declare 'clever' attributes that change the behaviour of whatever the aspect has been applied to at runtime. A simple example of this is logging. By declaring a TraceAttribute that derives from OnMethodBoundaryAspect, you can automatically log when a method has been executed: public class TraceAttribute : PostSharp.Aspects.OnMethodBoundaryAspect { public override void OnEntry(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Entering {0}.{1}.", method.DeclaringType.FullName, method.Name)); } public override void OnExit(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Leaving {0}.{1}.", method.DeclaringType.FullName, method.Name)); } } [Trace] public void MethodToLog() { ... } Now, whenever MethodToLog is executed, the aspect will automatically log entry and exit, without having to add the logging code to MethodToLog itself. PostSharp Performance Now this does introduce a performance overhead - as you can see, the aspect allows access to the MethodBase of the method the aspect has been applied to. If you were limited to C#, you would be forced to retrieve each MethodBase instance using Type.GetMethod(), matching on the method name and signature. This is slow. Fortunately, PostSharp is not limited to C#. It can use any instruction available in IL. And in IL, you can do some very neat things. Ldtoken C# allows you to get the Type object corresponding to a specific type name using the typeof operator: Type t = typeof(Random); The C# compiler compiles this operator to the following IL: ldtoken [mscorlib]System.Random call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle( valuetype [mscorlib]System.RuntimeTypeHandle) The ldtoken instruction obtains a special handle to a type called a RuntimeTypeHandle, and from that, the Type object can be obtained using GetTypeFromHandle. These are both relatively fast operations - no string lookup is required, only direct assembly and CLR constructs are used. However, a little-known feature is that ldtoken is not just limited to types; it can also get information on methods and fields, encapsulated in a RuntimeMethodHandle or RuntimeFieldHandle: // get a MethodBase for String.EndsWith(string) ldtoken method instance bool [mscorlib]System.String::EndsWith(string) call class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle( valuetype [mscorlib]System.RuntimeMethodHandle) // get a FieldInfo for the String.Empty field ldtoken field string [mscorlib]System.String::Empty call class [mscorlib]System.Reflection.FieldInfo [mscorlib]System.Reflection.FieldInfo::GetFieldFromHandle( valuetype [mscorlib]System.RuntimeFieldHandle) These usages of ldtoken aren't usable from C# or VB, and aren't likely to be added anytime soon (Eric Lippert's done a blog post on the possibility of adding infoof, methodof or fieldof operators to C#). However, PostSharp deals directly with IL, and so can use ldtoken to get MethodBase objects quickly and cheaply, without having to resort to string lookups. The kicker However, there are problems. Because ldtoken for methods or fields isn't accessible from C# or VB, it hasn't been as well-tested as ldtoken for types. This has resulted in various obscure bugs in most versions of the CLR when dealing with ldtoken and methods, and specifically, generic methods and methods of generic types. This means that PostSharp was behaving incorrectly, or just plain crashing, when aspects were applied to methods that were generic in some way. So, PostSharp has to work around this. Without using the metadata tokens directly, the only way to get the MethodBase of generic methods is to use reflection: Type.GetMethod(), passing in the method name as a string along with information on the signature. Now, this works fine. It's slower than using ldtoken directly, but it works, and this only has to be done for generic methods. Unfortunately, this poses problems when the assembly is obfuscated. PostSharp and Obfuscation When using ldtoken, obfuscators don't affect how PostSharp operates. Because the ldtoken instruction directly references the type, method or field within the assembly, it is unaffected if the name of the object is changed by an obfuscator. However, the indirect loading used for generic methods was breaking, because that uses the name of the method when the assembly is put through the PostSharp postprocessor to lookup the MethodBase at runtime. If the name then changes, PostSharp can't find it anymore, and the assembly breaks. So, PostSharp needs to know about any changes an obfuscator does to an assembly. The way PostSharp does this is by adding another layer of indirection. When PostSharp obfuscation support is enabled, it includes an extra 'name table' resource in the assembly, consisting of a series of method & type names. When PostSharp needs to lookup a method using reflection, instead of encoding the method name directly, it looks up the method name at a fixed offset inside that name table: MethodBase genericMethod = typeof(ContainingClass).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: get_Prop1 21: set_Prop1 22: DoFoo 23: GetWibble When the assembly is later processed by an obfuscator, the obfuscator can replace all the method and type names within the name table with their new name. That way, the reflection lookups performed by PostSharp will now use the new names, and everything will work as expected: MethodBase genericMethod = typeof(#kGy).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: #kkA 21: #zAb 22: #EF5a 23: #2tg As you can see, this requires direct support by an obfuscator in order to perform these rewrites. Dotfuscator supports it, and now, starting with SmartAssembly 6.6.4, SmartAssembly does too. So, a relatively simple solution to a tricky problem, with some CLR bugs thrown in for good measure. You don't see those every day!

Read the article
Subterranean IL: Volatile

- by Simon Cooper

This time, we'll be having a look at the volatile. prefix instruction, and one of the differences between volatile in IL and C#. The volatile. prefix volatile is a tricky one, as there's varying levels of documentation on it. From what I can see, it has two effects: It prevents caching of the load or store value; rather than reading or writing to a cached version of the memory location (say, the processor register or cache), it forces the value to be loaded or stored at the 'actual' memory location, so it is then immediately visible to other threads. It forces a memory barrier at the prefixed instruction. This ensures instructions don't get re-ordered around the volatile instruction. This is slightly more complicated than it first seems, and only seems to matter on certain architectures. For more details, Joe Duffy has a blog post going into the details. For this post, I'll be concentrating on the first aspect of volatile. Caching field accesses To demonstrate this, I created a simple multithreaded IL program. It boils down to the following code: .class public Holder { .field public static class Holder holder .field public bool stop .method public static specialname void .cctor() { newobj instance void Holder::.ctor() stsfld class Holder Holder::holder ret }}.method private static void Main() { .entrypoint // Thread t = new Thread(new ThreadStart(DoWork)) // t.Start() // Thread.Sleep(2000) // Console.WriteLine("Stopping thread...") ldsfld class Holder Holder::holder ldc.i4.1 stfld bool Holder::stop call instance void [mscorlib]System.Threading.Thread::Join() ret}.method private static void DoWork() { ldsfld class Holder Holder::holder // while (!Holder.holder.stop) {} DoWork: dup ldfld bool Holder::stop brfalse DoWork pop ret} If you compile and run this code, you'll find that the call to Thread.Join() never returns - the DoWork spinlock is reading a cached version of Holder.stop, which is never being updated with the new value set by the Main method. Adding volatile to the ldfld fixes this: dupvolatile.ldfld bool Holder::stopbrfalse DoWork The volatile ldfld forces the field access to read direct from heap memory, which is then updated by the main thread, rather than using a cached copy. volatile in C# This highlights one of the differences between IL and C#. In IL, volatile only applies to the prefixed instruction, whereas in C#, volatile is specified on a field to indicate that all accesses to that field should be volatile (interestingly, there's no mention of the 'no caching' aspect of volatile in the C# spec; it only focuses on the memory barrier aspect). Furthermore, this information needs to be stored within the assembly somehow, as such a field might be accessed directly from outside the assembly, but there's no concept of a 'volatile field' in IL! How this information is stored with the field will be the subject of my next post.

Read the article
Developing Schema Compare for Oracle (Part 1)

- by Simon Cooper

SQL Compare is one of Red Gate's most successful SQL Server tools; it allows developers and DBAs to compare and synchronize the contents of their databases. Although similar tools exist for Oracle, they are quite noticeably lacking in the usability and stability that SQL Compare is known for in the SQL Server world. We could see a real need for a usable schema comparison tools for Oracle, and so the Schema Compare for Oracle project was born. Over the next few weeks, as we come up to release of v1, I'll be doing a series of posts on the development of Schema Compare for Oracle. For the first post, I thought I would start with the main pitfalls that we stumbled across when developing the product, especially from a SQL Server background. 1. Schemas and Databases The most obvious difference is that the concept of a 'database' is quite different between Oracle and SQL Server. On SQL Server, one server instance has multiple databases, each with separate schemas. There is typically little communication between separate databases, and most databases are no more than about 1000-2000 objects. This means SQL Compare can register an entire database in a reasonable amount of time, and cross-database dependencies probably won't be an issue. It is a quite different scene under Oracle, however. The terms 'database' and 'instance' are used interchangeably, (although technically 'database' refers to the datafiles on disk, and 'instance' the running Oracle process that reads & writes to the database), and a database is a single conceptual entity. This immediately presents problems, as it is infeasible to register an entire database as we do in SQL Compare; in my Oracle install, using the standard recommended options, there are 63975 system objects. If we tried to register all those, not only would it take hours, but the client would probably run out of memory before we finished. As a result, we had to allow people to specify what schemas they wanted to register. This decision had quite a few knock-on effects for the design, which I will cover in a future post. 2. Connecting to Oracle The next obvious difference is in actually connecting to Oracle – in SQL Server, you can specify a server and database, and off you go. On Oracle things are slightly more complicated. SIDs, Service Names, and TNS A database (the files on disk) must have a unique identifier for the databases on the system, called the SID. It also has a global database name, which consists of a name (which doesn't have to match the SID) and a domain. Alternatively, you can identify a database using a service name, which normally has a 1-to-1 relationship with instances, but may not if, for example, using RAC (Real Application Clusters) for redundancy and failover. You specify the computer and instance you want to connect to using TNS (Transparent Network Substrate). The user-visible parts are a config file (tnsnames.ora) on the client machine that specifies how to connect to an instance. For example, the entry for one of my test instances is: SC_11GDB1 = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = simonctest)(PORT = 1521)) ) (CONNECT_DATA = (SID = 11gR1db1) ) ) This gives the hostname, port, and SID of the instance I want to connect to, and associates it with a name (SC_11GDB1). The tnsnames syntax also allows you to specify failover, multiple descriptions and address lists, and client load balancing. You can then specify this TNS identifier as the data source in a connection string. Although using ODP.NET (the .NET dlls provided by Oracle) was fine for internal prototype builds, once we released the EAP we discovered that this simply wasn't an acceptable solution for installs on other people's machines. Due to .NET assembly strong naming, users had to have installed on their machines the exact same version of the ODP.NET dlls as we had on our build server. We couldn't ship the ODP.NET dlls with our installer as the Oracle license agreement prohibited this, and we didn't want to force users to install another Oracle client just so they can run our program. To be able to list the TNS entries in the connection dialog, we also had to locate and parse the tnsnames.ora file, which was complicated by users with several Oracle client installs and intricate TNS entries. After much swearing at our computers, we eventually decided to use a third party Oracle connection library from Devart that we could ship with our program; this could use whatever client version was installed, parse the TNS entries for us, and also had the nice feature of being able to connect to an Oracle server without having any client installed at all. Unfortunately, their current license agreement prevents us from shipping an Oracle SDK, but that's a bridge we'll cross when we get to it. 3. Running synchronization scripts The most important difference is that in Oracle, DDL is non-transactional; you cannot rollback DDL statements like you can on SQL Server. Although we considered various solutions to this, including using the flashback archive or recycle bin, or generating an undo script, no reliable method of completely undoing a half-executed sync script has yet been found; so in this case we simply have to trust that the DBA or developer will check and verify the script before running it. However, before we got to that stage, we had to get the scripts to run in the first place... To run a synchronization script from SQL Compare we essentially pass the script over to the SqlCommand.ExecuteNonQuery method. However, when we tried to do the same for an OracleConnection we got a very strange error – 'ORA-00911: invalid character', even when running the most basic CREATE TABLE command. After much hair-pulling and Googling, we discovered that Oracle has got some very strange behaviour with semicolons at the end of statements. To understand what's going on, we need to take a quick foray into SQL and PL/SQL. PL/SQL is not T-SQL In SQL Server, T-SQL is the language used to interface with the database. It has DDL, DML, control flow, and many other nice features (like Turing-completeness) that you can mix and match in the same script. In Oracle, DDL SQL and PL/SQL are two completely separate languages, with different syntax, different datatypes and different execution engines within the instance. Oracle SQL is much more like 'pure' ANSI SQL, with no state, no control flow, and only the basic DML commands. PL/SQL is the Turing-complete language, but can only do DML and DCL (i.e. BEGIN TRANSATION commands). Any DDL or SQL commands that aren't recognised by the PL/SQL engine have to be passed back to the SQL engine via an EXECUTE IMMEDIATE command. In PL/SQL, a semicolons is a valid token used to delimit the end of a statement. In SQL, a semicolon is not a valid token (even though the Oracle documentation gives them at the end of the syntax diagrams) . When you execute the command CREATE TABLE table1 (COL1 NUMBER); in SQL*Plus the semicolon on the end is a command to SQL*Plus to execute the preceding statement on the server; it strips off the semicolon before passing it on. SQL Developer does a similar thing. When executing a PL/SQL block, however, the syntax is like so: BEGIN INSERT INTO table1 VALUES (1); INSERT INTO table1 VALUES (2); END; / In this case, the semicolon is accepted by the PL/SQL engine as a statement delimiter, and instead the / is the command to SQL*Plus to execute the current block. This explains the ORA-00911 error we got when trying to run the CREATE TABLE command – the server is complaining about the semicolon on the end. This also means that there is no SQL syntax to execute more than one DDL command in the same OracleCommand. Therefore, we would have to do a round-trip to the server for every command we want to execute. Obviously, this would cause lots of network traffic and be very slow on slow or congested networks. Our first attempt at a solution was to wrap every SQL statement (without semicolon) inside an EXECUTE IMMEDIATE command in a PL/SQL block and pass that to the server to execute. One downside of this solution is that we get no feedback as to how the script execution is going; we're currently evaluating better solutions to this thorny issue. Next up: Dependencies; how we solved the problem of being unable to register the entire database, and the knock-on effects to the whole product.

Read the article
.NET Security Part 4

- by Simon Cooper

Finally, in this series, I am going to cover some of the security issues that can trip you up when using sandboxed appdomains. DISCLAIMER: I am not a security expert, and this is by no means an exhaustive list. If you actually are writing security-critical code, then get a proper security audit of your code by a professional. The examples below are just illustrations of the sort of things that can go wrong. 1. AppDomainSetup.ApplicationBase The most obvious one is the issue covered in the MSDN documentation on creating a sandbox, in step 3 – the sandboxed appdomain has the same ApplicationBase as the controlling appdomain. So let’s explore what happens when they are the same, and an exception is thrown. In the sandboxed assembly, Sandboxed.dll (IPlugin is an interface in a partially-trusted assembly, with a single MethodToDoThings on it): public class UntrustedPlugin : MarshalByRefObject, IPlugin { // implements IPlugin.MethodToDoThings() public void MethodToDoThings() { throw new EvilException(); } } [Serializable] internal class EvilException : Exception { public override string ToString() { // show we have read access to C:\Windows // read the first 5 directories Console.WriteLine("Pwned! Mwuahahah!"); foreach (var d in Directory.EnumerateDirectories(@"C:\Windows").Take(5)) { Console.WriteLine(d.FullName); } return base.ToString(); } } And in the controlling assembly: // what can possibly go wrong? AppDomainSetup appDomainSetup = new AppDomainSetup { ApplicationBase = AppDomain.CurrentDomain.SetupInformation.ApplicationBase } // only grant permissions to execute // and to read the application base, nothing else PermissionSet restrictedPerms = new PermissionSet(PermissionState.None); restrictedPerms.AddPermission( new SecurityPermission(SecurityPermissionFlag.Execution)); restrictedPerms.AddPermission( new FileIOPermission(FileIOPermissionAccess.Read, appDomainSetup.ApplicationBase); restrictedPerms.AddPermission( new FileIOPermission(FileIOPermissionAccess.pathDiscovery, appDomainSetup.ApplicationBase); // create the sandbox AppDomain sandbox = AppDomain.CreateDomain("Sandbox", null, appDomainSetup, restrictedPerms); // execute UntrustedPlugin in the sandbox // don't crash the application if the sandbox throws an exception IPlugin o = (IPlugin)sandbox.CreateInstanceFromAndUnwrap("Sandboxed.dll", "UntrustedPlugin"); try { o.MethodToDoThings() } catch (Exception e) { Console.WriteLine(e.ToString()); } And the result? Oops. We’ve allowed a class that should be sandboxed to execute code with fully-trusted permissions! How did this happen? Well, the key is the exact meaning of the ApplicationBase property: The application base directory is where the assembly manager begins probing for assemblies. When EvilException is thrown, it propagates from the sandboxed appdomain into the controlling assembly’s appdomain (as it’s marked as Serializable). When the exception is deserialized, the CLR finds and loads the sandboxed dll into the fully-trusted appdomain. Since the controlling appdomain’s ApplicationBase directory contains the sandboxed assembly, the CLR finds and loads the assembly into a full-trust appdomain, and the evil code is executed. So the problem isn’t exactly that the sandboxed appdomain’s ApplicationBase is the same as the controlling appdomain’s, it’s that the sandboxed dll was in such a place that the controlling appdomain could find it as part of the standard assembly resolution mechanism. The sandbox then forced the assembly to load in the controlling appdomain by throwing a serializable exception that propagated outside the sandbox. The easiest fix for this is to keep the sandbox ApplicationBase well away from the ApplicationBase of the controlling appdomain, and don’t allow the sandbox permissions to access the controlling appdomain’s ApplicationBase directory. If you do this, then the sandboxed assembly can’t be accidentally loaded into the fully-trusted appdomain, and the code can’t be executed. If the plugin does try to induce the controlling appdomain to load an assembly it shouldn’t, a SerializationException will be thrown when it tries to load the assembly to deserialize the exception, and no damage will be done. 2. Loading the sandboxed dll into the application appdomain As an extension of the previous point, you shouldn’t directly reference types or methods in the sandboxed dll from your application code. That loads the assembly into the fully-trusted appdomain, and from there code in the assembly could be executed. Instead, pull out methods you want the sandboxed dll to have into an interface or class in a partially-trusted assembly you control, and execute methods via that instead (similar to the example above with the IPlugin interface). If you need to have a look at the assembly before executing it in the sandbox, either examine the assembly using reflection from within the sandbox, or load the assembly into the Reflection-only context in the application’s appdomain. The code in assemblies in the reflection-only context can’t be executed, it can only be reflected upon, thus protecting your appdomain from malicious code. 3. Incorrectly asserting permissions You should only assert permissions when you are absolutely sure they’re safe. For example, this method allows a caller read-access to any file they call this method with, including your documents, any network shares, the C:\Windows directory, etc: [SecuritySafeCritical] public static string GetFileText(string filePath) { new FileIOPermission(FileIOPermissionAccess.Read, filePath).Assert(); return File.ReadAllText(filePath); } Be careful when asserting permissions, and ensure you’re not providing a loophole sandboxed dlls can use to gain access to things they shouldn’t be able to. Conclusion Hopefully, that’s given you an idea of some of the ways it’s possible to get past the .NET security system. As I said before, this post is not exhaustive, and you certainly shouldn’t base any security-critical applications on the contents of this blog post. What this series should help with is understanding the possibilities of the security system, and what all the security attributes and classes mean and what they are used for, if you were to use the security system in the future.

Read the article
Subterranean IL: Exception handling 1

- by Simon Cooper

Today, I'll be starting a look at the Structured Exception Handling mechanism within the CLR. Exception handling is quite a complicated business, and, as a result, the rules governing exception handling clauses in IL are quite strict; you need to be careful when writing exception clauses in IL. Exception handlers Exception handlers are specified using a .try clause within a method definition. .try <TryStartLabel> to <TryEndLabel> <HandlerType> handler <HandlerStartLabel> to <HandlerEndLabel> As an example, a basic try/catch block would be specified like so: TryBlockStart: // ... leave.s CatchBlockEndTryBlockEnd:CatchBlockStart: // at the start of a catch block, the exception thrown is on the stack callvirt instance string [mscorlib]System.Object::ToString() call void [mscorlib]System.Console::WriteLine(string) leave.s CatchBlockEnd CatchBlockEnd: // method code continues... .try TryBlockStart to TryBlockEnd catch [mscorlib]System.Exception handler CatchBlockStart to CatchBlockEnd There are four different types of handler that can be specified: catch <TypeToken> This is the standard exception catch clause; you specify the object type that you want to catch (for example, [mscorlib]System.ArgumentException). Any object can be thrown as an exception, although Microsoft recommend that only classes derived from System.Exception are thrown as exceptions. filter <FilterLabel> A filter block allows you to provide custom logic to determine if a handler block should be run. This functionality is exposed in VB, but not in C#. finally A finally block executes when the try block exits, regardless of whether an exception was thrown or not. fault This is similar to a finally block, but a fault block executes only if an exception was thrown. This is not exposed in VB or C#. You can specify multiple catch or filter handling blocks in each .try, but fault and finally handlers must have their own .try clause. We'll look into why this is in later posts. Scoped exception handlers The .try syntax is quite tricky to use; it requires multiple labels, and you've got to be careful to keep separate the different exception handling sections. However, starting from .NET 2, IL allows you to use scope blocks to specify exception handlers instead. Using this syntax, the example above can be written like so: .try { // ... leave.s EndSEH}catch [mscorlib]System.Exception { callvirt instance string [mscorlib]System.Object::ToString() call void [mscorlib]System.Console::WriteLine(string) leave.s EndSEH}EndSEH:// method code continues... As you can see, this is much easier to write (and read!) than a stand-alone .try clause. Next time, I'll be looking at some of the restrictions imposed by SEH on control flow, and how the C# compiler generated exception handling clauses.

Read the article
Developing Schema Compare for Oracle (Part 2): Dependencies

- by Simon Cooper

In developing Schema Compare for Oracle, one of the issues we came across was the size of the databases. As detailed in my last blog post, we had to allow schema pre-filtering due to the number of objects in a standard Oracle database. Unfortunately, this leads to some quite tricky situations regarding object dependencies. This post explains how we deal with these dependencies. 1. Cross-schema dependencies Say, in the following database, you're populating SchemaA, and synchronizing SchemaA.Table1: SOURCE TARGET CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(Col1)); CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100) REFERENCES SchemaB.Table1(Col1)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY); CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100) PRIMARY KEY); We need to do a rebuild of SchemaA.Table1 to change Col1 from a VARCHAR2(100) to a NUMBER. This consists of: Creating a table with the new schema Inserting data from the old table to the new table, with appropriate conversion functions (in this case, TO_NUMBER) Dropping the old table Rename new table to same name as old table Unfortunately, in this situation, the rebuild will fail at step 1, as we're trying to create a NUMBER column with a foreign key reference to a VARCHAR2(100) column. As we're only populating SchemaA, the naive implementation of the object population prefiltering (sticking a WHERE owner = 'SCHEMAA' on all the data dictionary queries) will generate an incorrect sync script. What we actually have to do is: Drop foreign key constraint on SchemaA.Table1 Rebuild SchemaB.Table1 Rebuild SchemaA.Table1, adding the foreign key constraint to the new table This means that in order to generate a correct synchronization script for SchemaA.Table1 we have to know what SchemaB.Table1 is, and that it also needs to be rebuilt to successfully rebuild SchemaA.Table1. SchemaB isn't the schema that the user wants to synchronize, but we still have to load the table and column information for SchemaB.Table1 the same way as any table in SchemaA. Fortunately, Oracle provides (mostly) complete dependency information in the dictionary views. Before we actually read the information on all the tables and columns in the database, we can get dependency information on all the objects that are either pointed at by objects in the schemas we’re populating, or point to objects in the schemas we’re populating (think about what would happen if SchemaB was being explicitly populated instead), with a suitable query on all_constraints (for foreign key relationships) and all_dependencies (for most other types of dependencies eg a function using another function). The extra objects found can then be included in the actual object population, and the sync wizard then has enough information to figure out the right thing to do when we get to actually synchronize the objects. Unfortunately, this isn’t enough. 2. Dependency chains The solution above will only get the immediate dependencies of objects in populated schemas. What if there’s a chain of dependencies? A.tbl1 -> B.tbl1 -> C.tbl1 -> D.tbl1 If we’re only populating SchemaA, the implementation above will only include B.tbl1 in the dependent objects list, whereas we might need to know about C.tbl1 and D.tbl1 as well, in order to ensure a modification on A.tbl1 can succeed. What we actually need is a graph traversal on the dependency graph that all_dependencies represents. Fortunately, we don’t have to read all the database dependency information from the server and run the graph traversal on the client computer, as Oracle provides a method of doing this in SQL – CONNECT BY. So, we can put all the dependencies we want to include together in big bag with UNION ALL, then run a SELECT ... CONNECT BY on it, starting with objects in the schema we’re populating. We should end up with all the objects that might be affected by modifications in the initial schema we’re populating. Good solution? Well, no. For one thing, it’s sloooooow. all_dependencies, on my test databases, has got over 110,000 rows in it, and the entire query, for which Oracle was creating a temporary table to hold the big bag of graph edges, was often taking upwards of two minutes. This is too long, and would only get worse for large databases. But it had some more fundamental problems than just performance. 3. Comparison dependencies Consider the following schema: SOURCE TARGET CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(col1)); CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY); CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100)); What will happen if we used the dependency algorithm above on the source & target database? Well, SchemaA.Table1 has a foreign key reference to SchemaB.Table1, so that will be included in the source database population. On the target, SchemaA.Table1 has no such reference. Therefore SchemaB.Table1 will not be included in the target database population. In the resulting comparison of the two objects models, what you will end up with is: SOURCE TARGET SchemaA.Table1 -> SchemaA.Table1 SchemaB.Table1 -> (no object exists) When this comparison is synchronized, we will see that SchemaB.Table1 does not exist, so we will try the following sequence of actions: Create SchemaB.Table1 Rebuild SchemaA.Table1, with foreign key to SchemaB.Table1 Oops. Because the dependencies are only followed within a single database, we’ve tried to create an object that already exists. To fix this we can include any objects found as dependencies in the source or target databases in the object population of both databases. SchemaB.Table1 will then be included in the target database population, and we won’t try and create objects that already exist. All good? Well, consider the following schema (again, only explicitly populating SchemaA, and synchronizing SchemaA.Table1): SOURCE TARGET CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(col1)); CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY); CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100) PRIMARY KEY); CREATE TABLE SchemaC.Table1 ( Col1 NUMBER); CREATE TABLE SchemaC.Table1 ( Col1 VARCHAR2(100) REFERENCES SchemaB.Table1); Although we’re now including SchemaB.Table1 on both sides of the comparison, there’s a third table (SchemaC.Table1) that we don’t know about that will cause the rebuild of SchemaB.Table1 to fail if we try and synchronize SchemaA.Table1. That’s because we’re only running the dependency query on the schemas we’re explicitly populating; to solve this issue, we would have to run the dependency query again, but this time starting the graph traversal from the objects found in the other database. Furthermore, this dependency chain could be arbitrarily extended.This leads us to the following algorithm for finding all the dependencies of a comparison: Find initial dependencies of schemas the user has selected to compare on the source and target Include these objects in both the source and target object populations Run the dependency query on the source, starting with the objects found as dependents on the target, and vice versa Repeat 2 & 3 until no more objects are found For the schema above, this will result in the following sequence of actions: Find initial dependenciesSchemaA.Table1 -> SchemaB.Table1 found on sourceNo objects found on target Include objects in both source and targetSchemaB.Table1 included in source and target Run dependency query, starting with found objectsNo objects to start with on sourceSchemaB.Table1 -> SchemaC.Table1 found on target Include objects in both source and targetSchemaC.Table1 included in source and target Run dependency query on found objectsNo objects found in sourceNo objects to start with in target Stop This will ensure that we include all the necessary objects to make any synchronization work. However, there is still the issue of query performance; the CONNECT BY on the entire database dependency graph is still too slow. After much sitting down and drawing complicated diagrams, we decided to move the graph traversal algorithm from the server onto the client (which turned out to run much faster on the client than on the server); and to ensure we don’t read the entire dependency graph onto the client we also pull the graph across in bits – we start off with dependency edges involving schemas selected for explicit population, and whenever the graph traversal comes across a dependency reference to a schema we don’t yet know about a thunk is hit that pulls in the dependency information for that schema from the database. We continue passing more dependent objects back and forth between the source and target until no more dependency references are found. This gives us the list of all the extra objects to populate in the source and target, and object population can then proceed. 4. Object blacklists and fast dependencies When we tested this solution, we were puzzled in that in some of our databases most of the system schemas (WMSYS, ORDSYS, EXFSYS, XDB, etc) were being pulled in, and this was increasing the database registration and comparison time quite significantly. After debugging, we discovered that the culprits were database tables that used one of the Oracle PL/SQL types (eg the SDO_GEOMETRY spatial type). These were creating a dependency chain from the database tables we were populating to the system schemas, and hence pulling in most of the system objects in that schema. To solve this we introduced blacklists of objects we wouldn’t follow any dependency chain through. As well as the Oracle-supplied PL/SQL types (MDSYS.SDO_GEOMETRY, ORDSYS.SI_COLOR, among others) we also decided to blacklist the entire PUBLIC and SYS schemas, as any references to those would likely lead to a blow up in the dependency graph that would massively increase the database registration time, and could result in the client running out of memory. Even with these improvements, each dependency query was taking upwards of a minute. We discovered from Oracle execution plans that there were some columns, with dependency information we required, that were querying system tables with no indexes on them! To cut a long story short, running the following query: SELECT * FROM all_tab_cols WHERE data_type_owner = ‘XDB’; results in a full table scan of the SYS.COL$ system table! This single clause was responsible for over half the execution time of the dependency query. Hence, the ‘Ignore slow dependencies’ option was born – not querying this and a couple of similar clauses to drastically speed up the dependency query execution time, at the expense of producing incorrect sync scripts in rare edge cases. Needless to say, along with the sync script action ordering, the dependency code in the database registration is one of the most complicated and most rewritten parts of the Schema Compare for Oracle engine. The beta of Schema Compare for Oracle is out now; if you find a bug in it, please do tell us so we can get it fixed!

Read the article
Subterranean IL: The ThreadLocal type

- by Simon Cooper

I came across ThreadLocal<T> while I was researching ConcurrentBag. To look at it, it doesn't really make much sense. What's all those extra Cn classes doing in there? Why is there a GenericHolder<T,U,V,W> class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem. Thread statics Declaring that a variable is thread static, that is, values assigned and read from the field is specific to the thread doing the reading, is quite easy in .NET: [ThreadStatic] private static string s_ThreadStaticField; ThreadStaticAttribute is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity. TheadStaticAttribute provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitary number of thread static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution - thread local data slots. This is a lesser-known function of Thread that has existed since .NET 1.1: LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1"); string value = "foo"; Thread.SetData(threadSlot, value); string gettedValue = (string)Thread.GetData(threadSlot); Each instance of LocalStoreDataSlot mediates access to a single slot, and each slot acts like a separate thread-static field. As you can see, using thread data slots is quite cumbersome. You need to keep track of LocalDataStoreSlot objects, it's not obvious how instances of LocalDataStoreSlot correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in Thread itself, using various arrays, lists, and locks for synchronization. ThreadLocal<T> is far simpler and easier to use. ThreadLocal ThreadLocal provides an abstraction around thread-static fields that allows it to be used just like any other class; it can be used as a replacement for a thread-static field, it can be used in a List<ThreadLocal<T>>, you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as static, and so shared between all instances of the declaring type. There's something else going on here. The values stored in instances of ThreadLocal<T> are stored in instantiations of the GenericHolder<T,U,V,W> class, which contains a single ThreadStatic field (s_value) to store the actual value. This class is then instantiated with various combinations of the Cn types for generic arguments. In .NET, each separate instantiation of a generic type has its own static state. For example, GenericHolder<int,C0,C1,C2> has a completely separate s_value field to GenericHolder<int,C1,C14,C1>. This feature is (ab)used by ThreadLocal to emulate instance thread-static fields. Every time an instance of ThreadLocal is constructed, it is assigned a unique number from the static s_currentTypeId field using Interlocked.Increment, in the FindNextTypeIndex method. The hexadecimal representation of that number then defines the specific Cn types that instantiates the GenericHolder class. That instantiation is therefore 'owned' by that instance of ThreadLocal. This gives each instance of ThreadLocal its own ThreadStatic field through a specific unique instantiation of the GenericHolder class. Although GenericHolder has four type variables, the first one is always instantiated to the type stored in the ThreadLocal<T>. This gives three free type variables, each of which can be instantiated to one of 16 types (C0 to C15). This puts an upper limit of 4096 (163) on the number of ThreadLocal<T> instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of ThreadLocal<string>, and separately a maximum of 4096 instances of ThreadLocal<object>, etc. However, there is an upper limit of 16384 enforced on the total number of ThreadLocal instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of GenericHolder<T,U,V,W>, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of ThreadLocal instances created is tracked by the ThreadLocalGlobalCounter class. So what happens when either limit is reached? Firstly, to try and stop this limit being reached, it recycles GenericHolder type indexes of ThreadLocal instances that get disposed using the s_availableIndices concurrent stack. This allows GenericHolder instantiations of disposed ThreadLocal instances to be re-used. But if there aren't any available instantiations, then ThreadLocal falls back on a standard thread local slot using TLSHolder. This makes it very important to dispose of your ThreadLocal instances if you'll be using lots of them, so the type instantiations can be recycled. The previous way of creating arbitary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, ThreadLocal can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of ThreadLocal its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.

Read the article
Inside the DLR – Invoking methods

- by Simon Cooper

So, we’ve looked at how a dynamic call is represented in a compiled assembly, and how the dynamic lookup is performed at runtime. The last piece of the puzzle is how the resolved method gets invoked, and that is the subject of this post. Invoking methods As discussed in my previous posts, doing a full lookup and bind at runtime each and every single time the callsite gets invoked would be far too slow to be usable. The results obtained from the callsite binder must to be cached, along with a series of conditions to determine whether the cached result can be reused. So, firstly, how are the conditions represented? These conditions can be anything; they are determined entirely by the semantics of the language the binder is representing. The binder has to be able to return arbitary code that is then executed to determine whether the conditions apply or not. Fortunately, .NET 4 has a neat way of representing arbitary code that can be easily combined with other code – expression trees. All the callsite binder has to return is an expression (called a ‘restriction’) that evaluates to a boolean, returning true when the restriction passes (indicating the corresponding method invocation can be used) and false when it does’t. If the bind result is also represented in an expression tree, these can be combined easily like so: if ([restriction is true]) { [invoke cached method] } Take my example from my previous post: public class ClassA { public static void TestDynamic() { CallDynamic(new ClassA(), 10); CallDynamic(new ClassA(), "foo"); } public static void CallDynamic(dynamic d, object o) { d.Method(o); } public void Method(int i) {} public void Method(string s) {} } When the Method(int) method is first bound, along with an expression representing the result of the bind lookup, the C# binder will return the restrictions under which that bind can be reused. In this case, it can be reused if the types of the parameters are the same: if (thisArg.GetType() == typeof(ClassA) && arg1.GetType() == typeof(int)) { thisClassA.Method(i); } Caching callsite results So, now, it’s up to the callsite to link these expressions returned from the binder together in such a way that it can determine which one from the many it has cached it should use. This caching logic is all located in the System.Dynamic.UpdateDelegates class. It’ll help if you’ve got this type open in a decompiler to have a look yourself. For each callsite, there are 3 layers of caching involved: The last method invoked on the callsite. All methods that have ever been invoked on the callsite. All methods that have ever been invoked on any callsite of the same type. We’ll cover each of these layers in order Level 1 cache: the last method called on the callsite When a CallSite<T> object is first instantiated, the Target delegate field (containing the delegate that is called when the callsite is invoked) is set to one of the UpdateAndExecute generic methods in UpdateDelegates, corresponding to the number of parameters to the callsite, and the existance of any return value. These methods contain most of the caching, invoke, and binding logic for the callsite. The first time this method is invoked, the UpdateAndExecute method finds there aren’t any entries in the caches to reuse, and invokes the binder to resolve a new method. Once the callsite has the result from the binder, along with any restrictions, it stitches some extra expressions in, and replaces the Target field in the callsite with a compiled expression tree similar to this (in this example I’m assuming there’s no return value): if ([restriction is true]) { [invoke cached method] return; } if (callSite._match) { _match = false; return; } else { UpdateAndExecute(callSite, arg0, arg1, ...); } Woah. What’s going on here? Well, this resulting expression tree is actually the first level of caching. The Target field in the callsite, which contains the delegate to call when the callsite is invoked, is set to the above code compiled from the expression tree into IL, and then into native code by the JIT. This code checks whether the restrictions of the last method that was invoked on the callsite (the ‘primary’ method) match, and if so, executes that method straight away. This means that, the next time the callsite is invoked, the first code that executes is the restriction check, executing as native code! This makes this restriction check on the primary cached delegate very fast. But what if the restrictions don’t match? In that case, the second part of the stitched expression tree is executed. What this section should be doing is calling back into the UpdateAndExecute method again to resolve a new method. But it’s slightly more complicated than that. To understand why, we need to understand the second and third level caches. Level 2 cache: all methods that have ever been invoked on the callsite When a binder has returned the result of a lookup, as well as updating the Target field with a compiled expression tree, stitched together as above, the callsite puts the same compiled expression tree in an internal list of delegates, called the rules list. This list acts as the level 2 cache. Why use the same delegate? Stitching together expression trees is an expensive operation. You don’t want to do it every time the callsite is invoked. Ideally, you would create one expression tree from the binder’s result, compile it, and then use the resulting delegate everywhere in the callsite. But, if the same delegate is used to invoke the callsite in the first place, and in the caches, that means each delegate needs two modes of operation. An ‘invoke’ mode, for when the delegate is set as the value of the Target field, and a ‘match’ mode, used when UpdateAndExecute is searching for a method in the callsite’s cache. Only in the invoke mode would the delegate call back into UpdateAndExecute. In match mode, it would simply return without doing anything. This mode is controlled by the _match field in CallSite<T>. The first time the callsite is invoked, _match is false, and so the Target delegate is called in invoke mode. Then, if the initial restriction check fails, the Target delegate calls back into UpdateAndExecute. This method sets _match to true, then calls all the cached delegates in the rules list in match mode to try and find one that passes its restrictions, and invokes it. However, there needs to be some way for each cached delegate to inform UpdateAndExecute whether it passed its restrictions or not. To do this, as you can see above, it simply re-uses _match, and sets it to false if it did not pass the restrictions. This allows the code within each UpdateAndExecute method to check for cache matches like so: foreach (T cachedDelegate in Rules) { callSite._match = true; cachedDelegate(); // sets _match to false if restrictions do not pass if (callSite._match) { // passed restrictions, and the cached method was invoked // set this delegate as the primary target to invoke next time callSite.Target = cachedDelegate; return; } // no luck, try the next one... } Level 3 cache: all methods that have ever been invoked on any callsite with the same signature The reason for this cache should be clear – if a method has been invoked through a callsite in one place, then it is likely to be invoked on other callsites in the codebase with the same signature. Rather than living in the callsite, the ‘global’ cache for callsite delegates lives in the CallSiteBinder class, in the Cache field. This is a dictionary, typed on the callsite delegate signature, providing a RuleCache<T> instance for each delegate signature. This is accessed in the same way as the level 2 callsite cache, by the UpdateAndExecute methods. When a method is matched in the global cache, it is copied into the callsite and Target cache before being executed. Putting it all together So, how does this all fit together? Like so (I’ve omitted some implementation & performance details): That, in essence, is how the DLR performs its dynamic calls nearly as fast as statically compiled IL code. Extensive use of expression trees, compiled to IL and then into native code. Multiple levels of caching, the first of which executes immediately when the dynamic callsite is invoked. And a clever re-use of compiled expression trees that can be used in completely different contexts without being recompiled. All in all, a very fast and very clever reflection caching mechanism.

Read the article
Why enumerator structs are a really bad idea (redux)

- by Simon Cooper

My previous blog post went into some detail as to why calling MoveNext on a BCL generic collection enumerator didn't quite do what you thought it would. This post covers the Reset method. To recap, here's the simple wrapper around a linked list enumerator struct from my previous post (minus the readonly on the enumerator variable): sealed class EnumeratorWrapper : IEnumerator<int> { private LinkedList<int>.Enumerator m_Enumerator; public EnumeratorWrapper(LinkedList<int> linkedList) { m_Enumerator = linkedList.GetEnumerator(); } public int Current { get { return m_Enumerator.Current; } } object System.Collections.IEnumerator.Current { get { return Current; } } public bool MoveNext() { return m_Enumerator.MoveNext(); } public void Reset() { ((System.Collections.IEnumerator)m_Enumerator).Reset(); } public void Dispose() { m_Enumerator.Dispose(); } } If you have a look at the Reset method, you'll notice I'm having to cast to IEnumerator to be able to call Reset on m_Enumerator. This is because the implementation of LinkedList<int>.Enumerator.Reset, and indeed of all the other Reset methods on the BCL generic collection enumerators, is an explicit interface implementation. However, IEnumerator is a reference type. LinkedList<int>.Enumerator is a value type. That means, in order to call the reset method at all, the enumerator has to be boxed. And the IL confirms this: .method public hidebysig newslot virtual final instance void Reset() cil managed { .maxstack 8 L_0000: nop L_0001: ldarg.0 L_0002: ldfld valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator<int32> EnumeratorWrapper::m_Enumerator L_0007: box [System]System.Collections.Generic.LinkedList`1/Enumerator<int32> L_000c: callvirt instance void [mscorlib]System.Collections.IEnumerator::Reset() L_0011: nop L_0012: ret } On line 0007, we're doing a box operation, which copies the enumerator to a reference object on the heap, then on line 000c calling Reset on this boxed object. So m_Enumerator in the wrapper class is not modified by the call the Reset. And this is the only way to call the Reset method on this variable (without using reflection). Therefore, the only way that the collection enumerator struct can be used safely is to store them as a boxed IEnumerator<T>, and not use them as value types at all.

Read the article
Headaches using distributed version control for traditional teams?

- by J Cooper

Though I use and like DVCS for my personal projects, and can totally see how it makes managing contributions to your project from others easier (e.g. your typical Github scenario), it seems like for a "traditional" team there could be some problems over the centralized approach employed by solutions like TFS, Perforce, etc. (By "traditional" I mean a team of developers in an office working on one project that no one person "owns", with potentially everyone touching the same code.) A couple of these problems I've foreseen on my own, but please chime in with other considerations. In a traditional system, when you try to check your change in to the server, if someone else has previously checked in a conflicting change then you are forced to merge before you can check yours in. In the DVCS model, each developer checks in their changes locally and at some point pushes to some other repo. That repo then has a branch of that file that 2 people changed. It seems that now someone must be put in charge of dealing with that situation. A designated person on the team might not have sufficient knowledge of the entire codebase to be able to handle merging all conflicts. So now an extra step has been added where someone has to approach one of those developers, tell him to pull and do the merge and then push again (or you have to build an infrastructure that automates that task). Furthermore, since DVCS tends to make working locally so convenient, it is probable that developers could accumulate a few changes in their local repos before pushing, making such conflicts more common and more complicated. Obviously if everyone on the team only works on different areas of the code, this isn't an issue. But I'm curious about the case where everyone is working on the same code. It seems like the centralized model forces conflicts to be dealt with quickly and frequently, minimizing the need to do large, painful merges or have anyone "police" the main repo. So for those of you who do use a DVCS with your team in your office, how do you handle such cases? Do you find your daily (or more likely, weekly) workflow affected negatively? Are there any other considerations I should be aware of before recommending a DVCS at my workplace?

Read the article
Developing Schema Compare for Oracle (Part 5): Query Snapshots

- by Simon Cooper

If you've emailed us about a bug you've encountered with the EAP or beta versions of Schema Compare for Oracle, we probably asked you to send us a query snapshot of your databases. Here, I explain what a query snapshot is, and how it helps us fix your bug. Problem 1: Debugging users' bug reports When we started the Schema Compare project, we knew we were going to get problems with users' databases - configurations we hadn't considered, features that weren't installed, unicode issues, wierd dependencies... With SQL Compare, users are generally happy to send us a database backup that we can restore using a single RESTORE DATABASE command on our test servers and immediately reproduce the problem. Oracle, on the other hand, would be a lot more tricky. As Oracle generally has a 1-to-1 mapping between instances and databases, any databases users sent would have to be restored to their own instance. Furthermore, the number of steps required to get a properly working database, and the size of most oracle databases, made it infeasible to ask every customer who came across a bug during our beta program to send us their databases. We also knew that there would be lots of issues with data security that would make it hard to get backups. So we needed an easier way to be able to debug customers issues and sort out what strange schema data Oracle was returning. Problem 2: Test execution time Another issue we knew we would have to solve was the execution time of the tests we would produce for the Schema Compare engine. Our initial prototype showed that querying the data dictionary for schema information was going to be slow (at least 15 seconds per database), and this is generally proportional to the size of the database. If you're running thousands of tests on the same databases, each one registering separate schemas, not only would the tests would take hours and hours to run, but the test servers would be hammered senseless. The solution To solve these, we needed to be able to populate the schema of a database without actually connecting to it. Well, the IDataReader interface is the primary way we read data from an Oracle server. The data dictionary queries we use return their data in terms of simple strings and numbers, which we then process and reconstruct into an object model, and the results of these queries are identical for identical schemas. So, we can record the raw results of the queries once, and then replay these results to construct the same object model as many times as required without needing to actually connect to the original database. This is what query snapshots do. They are binary files containing the raw unprocessed data we get back from the oracle server for all the queries we run on the data dictionary to get schema information. The core of the query snapshot generation takes the results of the IDataReader we get from running queries on Oracle, and passes the row data to a BinaryWriter that writes it straight to a file. The query snapshot can then be replayed to create the same object model; when the results of a specific query is needed by the population code, we can simply read the binary data stored in the file on disk and present it through an IDataReader wrapper. This is far faster than querying the server over the network, and allows us to run tests in a reasonable time. They also allow us to easily debug a customers problem; using a simple snapshot generation program, users can generate a query snapshot that could be sent along with a bug report that we can immediately replay on our machines to let us debug the issue, rather than having to obtain database backups and restore databases to test systems. There are also far fewer problems with data security; query snapshots only contain schema information, which is generally less sensitive than table data. Query snapshots implementation However, actually implementing such a feature did have a couple of 'gotchas' to it. My second blog post detailed the development of the dependencies algorithm we use to ensure we get all the dependencies in the database, and that algorithm uses data from both databases to find all the needed objects - what database you're comparing to affects what objects get populated from both databases. We get information on these additional objects using an appropriate WHERE clause on all the population queries. So, in order to accurately replay the results of querying the live database, the query snapshot needs to be a snapshot of a comparison of two databases, not just populating a single database. Furthermore, although the code population queries (eg querying all_tab_cols to get column information) can simply be passed straight from the IDataReader to the BinaryWriter, we need to hook into and run the live dependencies algorithm while we're creating the snapshot to ensure we get the same WHERE clauses, and the same query results, as if we were populating straight from a live system. We also need to store the results of the dependencies queries themselves, as the resulting dependency graph is stored within the OracleDatabase object that is produced, and is later used to help order actions in synchronization scripts. This is significantly helped by the dependencies algorithm being a deterministic algorithm - given the same input, it will always return the same output. Therefore, when we're replaying a query snapshot, and processing dependency information, we simply have to return the results of the queries in the order we got them from the live database, rather than trying to calculate the contents of all_dependencies on the fly. Query snapshots are a significant feature in Schema Compare that really helps us to debug problems with the tool, as well as making our testers happier. Although not really user-visible, they are very useful to the development team to help us fix bugs in the product much faster than we otherwise would be able to.

Read the article
What does RESTful web applications mean? [closed]

- by John Cooper

Possible Duplicate: What is REST (in simple English) What does RESTful web applications mean? A web service is a function that can be accessed by other programs over the web (Http). To clarify a bit, when you create a website in PHP that outputs HTML its target is the browser and by extension the human being reading the page in the browser. A web service is not targeted at humans but rather at other programs. SOAP and REST are two ways of creating WebServices. Correct me if i am wrong? What are other ways i can create a WebService? What does it mean fully RESTful web Application?

Read the article
Handling SMS/email convergence: how does a good business app do it?

- by Tim Cooper

I'm writing a school administration software package, but it strikes me that many developers will face this same issue: when communicating with users, should you use email or SMS or both, and should you treat them as fundamentally equivalent channels such that any message can get sent using any media, (with long and short forms of the message template obviously) or should different business functions be specifically tailored to each of the 3? This question got kicked off "StackOverflow" for being overly general, so I'm hoping it's not too general for this site - the answers will no doubt be subjective but "you don't need to write a whole book to answer the question". I'm particularly interested in people who have direct experience of having written comparable business applications. Sub-questions: Do I treat SMS as "moderately secure" and email as less secure? (I'm thinking about booking tokens for parent/teacher nights, permission slips for excursions, absence explanation notes - so high security is not a requirement for us, although medium security is) Is it annoying for users to receive the same message on multiple channels? Should we have a unified framework that reports on delivery or lack thereof of emails and SMS's?

Read the article
Subterranean IL: Exception handling 2

- by Simon Cooper

Control flow in and around exception handlers is tightly controlled, due to the various ways the handler blocks can be executed. To start off with, I'll describe what SEH does when an exception is thrown. Handling exceptions When an exception is thrown, the CLR stops program execution at the throw statement and searches up the call stack looking for an appropriate handler; catch clauses are analyzed, and filter blocks are executed (I'll be looking at filter blocks in a later post). Then, when an appropriate catch or filter handler is found, the stack is unwound to that handler, executing successive finally and fault handlers in their own stack contexts along the way, and program execution continues at the start of the catch handler. Because catch, fault, finally and filter blocks can be executed essentially out of the blue by the SEH mechanism, without any reference to preceding instructions, you can't use arbitary branches in and out of exception handler blocks. Instead, you need to use specific instructions for control flow out of handler blocks: leave, endfinally/endfault, and endfilter. Exception handler control flow try blocks You cannot branch into or out of a try block or its handler using normal control flow instructions. The only way of entering a try block is by either falling through from preceding instructions, or by branching to the first instruction in the block. Once you are inside a try block, you can only leave it by throwing an exception or using the leave <label> instruction to jump to somewhere outside the block and its handler. The leave instructions signals the CLR to execute any finally handlers around the block. Most importantly, you cannot fall out of the block, and you cannot use a ret to return from the containing method (unlike in C#); you have to use leave to branch to a ret elsewhere in the method. As a side effect, leave empties the stack. catch blocks The only way of entering a catch block is if it is run by the SEH. At the start of the block execution, the thrown exception will be the only thing on the stack. The only way of leaving a catch block is to use throw, rethrow, or leave, in a similar way to try blocks. However, one thing you can do is use a leave to branch back to an arbitary place in the handler's try block! In other words, you can do this: .try { // ... newobj instance void [mscorlib]System.Exception::.ctor() throw MidTry: // ... leave.s RestOfMethod } catch [mscorlib]System.Exception { // ... leave.s MidTry } RestOfMethod: // ... As far as I know, this mechanism is not exposed in C# or VB. finally/fault blocks The only way of entering a finally or fault block is via the SEH, either as the result of a leave instruction in the corresponding try block, or as part of handling an exception. The only way to leave a finally or fault block is to use endfinally or endfault (both compile to the same binary representation), which continues execution after the finally/fault block, or, if the block was executed as part of handling an exception, signals that the SEH can continue walking the stack. filter blocks I'll be covering filters in a separate blog posts. They're quite different to the others, and have their own special semantics. Phew! Complicated stuff, but it's important to know if you're writing or outputting exception handlers in IL. Dealing with the C# compiler is probably best saved for the next post.

Read the article
Oh no! My padding's invalid!

- by Simon Cooper

Recently, I've been doing some work involving cryptography, and encountered the standard .NET CryptographicException: 'Padding is invalid and cannot be removed.' Searching on StackOverflow produces 57 questions concerning this exception; it's a very common problem encountered. So I decided to have a closer look. To test this, I created a simple project that decrypts and encrypts a byte array: // create some random data byte[] data = new byte[100]; new Random().NextBytes(data); // use the Rijndael symmetric algorithm RijndaelManaged rij = new RijndaelManaged(); byte[] encrypted; // encrypt the data using a CryptoStream using (var encryptor = rij.CreateEncryptor()) using (MemoryStream encryptedStream = new MemoryStream()) using (CryptoStream crypto = new CryptoStream( encryptedStream, encryptor, CryptoStreamMode.Write)) { crypto.Write(data, 0, data.Length); encrypted = encryptedStream.ToArray(); } byte[] decrypted; // and decrypt it again using (var decryptor = rij.CreateDecryptor()) using (CryptoStream crypto = new CryptoStream( new MemoryStream(encrypted), decryptor, CryptoStreamMode.Read)) { byte[] decrypted = new byte[data.Length]; crypto.Read(decrypted, 0, decrypted.Length); } Sure enough, I got exactly the same CryptographicException when trying to decrypt the data even in this simple example. Well, I'm obviously missing something, if I can't even get this single method right! What does the exception message actually mean? What am I missing? Well, after playing around a bit, I discovered the problem was fixed by changing the encryption step to this: // encrypt the data using a CryptoStream using (var encryptor = rij.CreateEncryptor()) using (MemoryStream encryptedStream = new MemoryStream()) { using (CryptoStream crypto = new CryptoStream( encryptedStream, encryptor, CryptoStreamMode.Write)) { crypto.Write(data, 0, data.Length); } encrypted = encryptedStream.ToArray(); } Aaaah, so that's what the problem was. The CryptoStream wasn't flushing all it's data to the MemoryStream before it was being read, and closing the stream causes it to flush everything to the backing stream. But why does this cause an error in padding? Cryptographic padding All symmetric encryption algorithms (of which Rijndael is one) operates on fixed block sizes. For Rijndael, the default block size is 16 bytes. This means the input needs to be a multiple of 16 bytes long. If it isn't, then the input is padded to 16 bytes using one of the padding modes. This is only done to the final block of data to be encrypted. CryptoStream has a special method to flush this final block of data - FlushFinalBlock. Calling Stream.Flush() does not flush the final block, as you might expect. Only by closing the stream or explicitly calling FlushFinalBlock is the final block, with any padding, encrypted and written to the backing stream. Without this call, the encrypted data is 16 bytes shorter than it should be. If this final block wasn't written, then the decryption gets to the final 16 bytes of the encrypted data and tries to decrypt it as the final block with padding. The end bytes don't match the padding scheme it's been told to use, therefore it throws an exception stating what is wrong - what the decryptor expects to be padding actually isn't, and so can't be removed from the stream. So, as well as closing the stream before reading the result, an alternative fix to my encryption code is the following: // encrypt the data using a CryptoStream using (var encryptor = rij.CreateEncryptor()) using (MemoryStream encryptedStream = new MemoryStream()) using (CryptoStream crypto = new CryptoStream( encryptedStream, encryptor, CryptoStreamMode.Write)) { crypto.Write(data, 0, data.Length); // explicitly flush the final block of data crypto.FlushFinalBlock(); encrypted = encryptedStream.ToArray(); } Conclusion So, if your padding is invalid, make sure that you close or call FlushFinalBlock on any CryptoStream performing encryption before you access the encrypted data. Flush isn't enough. Only then will the final block be present in the encrypted data, allowing it to be decrypted successfully.

Read the article
Developing Schema Compare for Oracle (Part 6): 9i Query Performance

- by Simon Cooper

All throughout the EAP and beta versions of Schema Compare for Oracle, our main request was support for Oracle 9i. After releasing version 1.0 with support for 10g and 11g, our next step was then to get version 1.1 of SCfO out with support for 9i. However, there were some significant problems that we had to overcome first. This post will concentrate on query execution time. When we first tested SCfO on a 9i server, after accounting for various changes to the data dictionary, we found that database registration was taking a long time. And I mean a looooooong time. The same database that on 10g or 11g would take a couple of minutes to register would be taking upwards of 30 mins on 9i. Obviously, this is not ideal, so a poke around the query execution plans was required. As an example, let's take the table population query - the one that reads ALL_TABLES and joins it with a few other dictionary views to get us back our list of tables. On 10g, this query takes 5.6 seconds. On 9i, it takes 89.47 seconds. The difference in execution plan is even more dramatic - here's the (edited) execution plan on 10g: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 108K| 939 || 1 | SORT ORDER BY | | 108K| 939 || 2 | NESTED LOOPS OUTER | | 108K| 938 ||* 3 | HASH JOIN RIGHT OUTER | | 103K| 762 || 4 | VIEW | ALL_EXTERNAL_LOCATIONS | 2058 | 3 ||* 20 | HASH JOIN RIGHT OUTER | | 73472 | 759 || 21 | VIEW | ALL_EXTERNAL_TABLES | 2097 | 3 ||* 34 | HASH JOIN RIGHT OUTER | | 39920 | 755 || 35 | VIEW | ALL_MVIEWS | 51 | 7 || 58 | NESTED LOOPS OUTER | | 39104 | 748 || 59 | VIEW | ALL_TABLES | 6704 | 668 || 89 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2025 | 5 || 106 | VIEW | ALL_PART_TABLES | 277 | 11 |------------------------------------------------------------------------------- And the same query on 9i: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 16P| 55G|| 1 | SORT ORDER BY | | 16P| 55G|| 2 | NESTED LOOPS OUTER | | 16P| 862M|| 3 | NESTED LOOPS OUTER | | 5251G| 992K|| 4 | NESTED LOOPS OUTER | | 4243M| 2578 || 5 | NESTED LOOPS OUTER | | 2669K| 1440 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 ||* 50 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2043 | ||* 66 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_TABLES | 1777K| ||* 80 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_LOCATIONS | 1744K| ||* 96 | VIEW | ALL_PART_TABLES | 852K| |------------------------------------------------------------------------------- Have a look at the cost column. 10g's overall query cost is 939, and 9i is 55,000,000,000 (or more precisely, 55,496,472,769). It's also having to process far more data. What on earth could be causing this huge difference in query cost? After trawling through the '10g New Features' documentation, we found item 1.9.2.21. Before 10g, Oracle advised that you do not collect statistics on data dictionary objects. From 10g, it advised that you do collect statistics on the data dictionary; for our queries, Oracle therefore knows what sort of data is in the dictionary tables, and so can generate an efficient execution plan. On 9i, no statistics are present on the system tables, so Oracle has to use the Rule Based Optimizer, which turns most LEFT JOINs into nested loops. If we force 9i to use hash joins, like 10g, we get a much better plan: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 7587K| 3704 || 1 | SORT ORDER BY | | 7587K| 3704 ||* 2 | HASH JOIN OUTER | | 7587K| 822 ||* 3 | HASH JOIN OUTER | | 5262K| 616 ||* 4 | HASH JOIN OUTER | | 2980K| 465 ||* 5 | HASH JOIN OUTER | | 710K| 432 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 || 50 | VIEW | ALL_PART_TABLES | 852K| 104 || 78 | VIEW | ALL_TAB_COMMENTS | 2043 | 14 || 93 | VIEW | ALL_EXTERNAL_LOCATIONS | 1744K| 31 || 106 | VIEW | ALL_EXTERNAL_TABLES | 1777K| 28 |------------------------------------------------------------------------------- That's much more like it. This drops the execution time down to 24 seconds. Not as good as 10g, but still an improvement. There are still several problems with this, however. 10g introduced a new join method - a right outer hash join (used in the first execution plan). The 9i query optimizer doesn't have this option available, so forcing a hash join means it has to hash the ALL_TABLES table, and furthermore re-hash it for every hash join in the execution plan; this could be thousands and thousands of rows. And although forcing hash joins somewhat alleviates this problem on our test systems, there's no guarantee that this will improve the execution time on customers' systems; it may even increase the time it takes (say, if all their tables are partitioned, or they've got a lot of materialized views). Ideally, we would want a solution that provides a speedup whatever the input. To try and get some ideas, we asked some oracle performance specialists to see if they had any ideas or tips. Their recommendation was to add a hidden hook into the product that allowed users to specify their own query hints, or even rewrite the queries entirely. However, we would prefer not to take that approach; as well as a lot of new infrastructure & a rewrite of the population code, it would have meant that any users of 9i would have to spend some time optimizing it to get it working on their system before they could use the product. Another approach was needed. All our population queries have a very specific pattern - a base table provides most of the information we need (ALL_TABLES for tables, or ALL_TAB_COLS for columns) and we do a left join to extra subsidiary tables that fill in gaps (for instance, ALL_PART_TABLES for partition information). All the left joins use the same set of columns to join on (typically the object owner & name), so we could re-use the hash information for each join, rather than re-hashing the same columns for every join. To allow us to do this, along with various other performance improvements that could be done for the specific query pattern we were using, we read all the tables individually and do a hash join on the client. Fortunately, this 'pure' algorithmic problem is the kind that can be very well optimized for expected real-world situations; as well as storing row data we're not using in the hash key on disk, we use very specific memory-efficient data structures to store all the information we need. This allows us to achieve a database population time that is as fast as on 10g, and even (in some situations) slightly faster, and a memory overhead of roughly 150 bytes per row of data in the result set (for schemas with 10,000 tables in that means an extra 1.4MB memory being used during population). Next: fun with the 9i dictionary views.

Read the article
.NET vs Windows 8: Rematch!

- by Simon Cooper

So, although you will be able to use your existing .NET skills to develop Metro apps, it turns out Microsoft are limiting Visual Studio 2011 Express to Metro-only. From the Express website: Visual Studio 11 Express for Windows 8 provides tools for Metro style app development. To create desktop apps, you need to use Visual Studio 11 Professional, or higher. Oh dear. To develop any sort of non-Metro application, you will need to pay for at least VS Professional. I suspect Microsoft (or at least, certain groups within Microsoft) have a very explicit strategy in mind. By making VS Express Metro-only, developers who don't want to pay for Professional will be forced to make their simple one-shot or open-source application in Metro. This increases the number of applications available for Windows 8 and Windows mobile devices, which in turn make those platforms more attractive for consumers. When you use the free VS 11 Express, instead of paying Microsoft, you provide them a service by making applications for Metro, which in turn makes Microsoft's mobile offering more attractive to consumers, increasing their market share. Of course, it remains to be seen if developers forced to jump onto the Metro bandwagon will simply jump ship to Android or iOS instead. At least, that's what I think is going on. With Microsoft, who really knows?

Read the article
Developing Schema Compare for Oracle (Part 4): Script Configuration

- by Simon Cooper

If you've had a chance to play around with the Schema Compare for Oracle beta, you may have come across this screen in the synchronization wizard: This screen is one of the few screens that, along with the project configuration form, doesn't come from SQL Compare. This screen was designed to solve a couple of issues that, although aren't specific to Oracle, are much more of a problem than on SQL Server: Datatype conversions and NOT NULL columns. 1. Datatype conversions SQL Server is generally quite forgiving when it comes to datatype conversions using ALTER TABLE. For example, you can convert from a VARCHAR to INT using ALTER TABLE as long as all the character values are parsable as integers. Oracle, on the other hand, only allows ALTER TABLE conversions that don't change the internal data format. Essentially, every change that requires an actual datatype conversion has to be done using a rebuild with a conversion function. That's OK, as we can simply hard-code the various conversion functions for the valid datatype conversions and insert those into the rebuild SELECT list. However, as there always is with Oracle, there's a catch. Have a look at the NUMTODSINTERVAL function. As well as specifying the value (or column) to convert, you have to specify an interval_unit, which tells oracle how to interpret the input number. We can't hardcode a default for this parameter, as it is entirely dependent on the user's data context! So, in order to convert NUMBER to INTERVAL DAY TO SECOND/INTERVAL YEAR TO MONTH, we need to have feedback from the user as to what to put in this parameter while we're generating the sync script - this requires a new step in the engine action/script generation to insert these values into the script, as well as new UI to allow the user to specify these values in a sensible fashion. In implementing the engine and UI infrastructure to allow this it made much more sense to implement it for any rebuild datatype conversion, not just NUMBER to INTERVALs. For conversions which we can do, we pre-fill the 'value' box with the appropriate function from the documentation. The user can also type in arbitary SQL expressions, which allows the user to specify optional format parameters for the relevant conversion functions, or indeed call their own functions to convert between values that don't have a built-in conversion defined. As the value gets inserted as-is into the rebuild SELECT list, any expression that is valid in that context can be specified as the conversion value. 2. NOT NULL columns Another problem that is solved by the new step in the sync wizard is adding a NOT NULL column to a table. If the table contains data (as most database tables do), you can't just add a NOT NULL column, as Oracle doesn't know what value to put in the new column for existing rows - the DDL statement will fail. There are actually 3 separate scenarios for this problem that have separate solutions within the engine: Adding a NOT NULL column to a table without a rebuild Here, the workaround is to add a column default with an appropriate value to the column you're adding: ALTER TABLE tbl1 ADD newcol NUMBER DEFAULT <value> NOT NULL; Note, however, there is something to bear in mind about this solution; once specified on a column, a default cannot be removed. To 'remove' a default from a column you change it to have a default of NULL, hence there's code in the engine to treat a NULL default the same as no default at all. Adding a NOT NULL column to a table, where a separate change forced a table rebuild Fortunately, in this case, a column default is not required - we can simply insert the default value into the rebuild SELECT clause. Changing an existing NULL to a NOT NULL column To implement this, we run an UPDATE command before the ALTER TABLE to change all the NULLs in the column to the required default value. For all three, we need some way of allowing the user to specify a default value to use instead of NULL; as this is essentially the same problem as datatype conversion (inserting values into the sync script), we can re-use the UI and engine implementation of datatype conversion values. We also provide the option to alter the new column to allow NULLs, or to ignore the problem completely. Note that there is the same (long-running) problem in SQL Compare, but it is much more of an issue in Oracle as you cannot easily roll back executed DDL statements if the script fails at some point during execution. Furthermore, the engine of SQL Compare is far less conducive to inserting user-supplied values into the generated script. As we're writing the Schema Compare engine from scratch, we used what we learnt from the SQL Compare engine and designed it to be far more modular, which makes inserting procedures like this much easier.

Read the article
MySQL problem: How to get desired rows.

- by Joonas Köppä

I have been trying to solve this problem for 2 hours now but I cant understand the solutions others have given people with a similar problem. Ive seen some answers but can't apply it to my own needs. I have a table of users and their times in different sports events. I need to make a scoretable that shows the user with the best time, second best etc. The table before sorting and retrieving looks as follows: | Name | Time | Date | '''''''''''''''''''''''''''''''''''''''''''''' | Jack | 03:07:13 | 2010-12-01 | | Peter | 05:03:12 | 2010-12-03 | | Jack | 03:53:19 | 2010-12-04 | | Simon | 03:22:59 | 2010-12-02 | | Simon | 04:01:11 | 2010-12-09 | | Peter | 03:19:17 | 2010-12-06 | '''''''''''''''''''''''''''''''''''''''''''''' | Name | Time | Date | '''''''''''''''''''''''''''''''''''''''''' | Jack | 03:07:13 | 2010-12-01 | | Peter | 03:19:17 | 2010-12-06 | | Simon | 03:22:59 | 2010-12-02 | '''''''''''''''''''''''''''''''''''''''''' I know answers to this problem lie in another question asked on this very site: CLICK HERE I just have no idea how to apply it to fullfill my needs. Help is highly appreciated. Thank you -Joonas

Read the article

Search Results

Search found 933 results on 38 pages for 'simon cooper'.

Page 4/38 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by J Cooper

- by Simon Cooper

- by John Cooper

- by Tim Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Joonas Köppä

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >