Search Results

Search found 20859 results on 835 pages for 'little bobby tables'.

Page 4/835 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Mysql insert into 2 tables

- by Spidfire

I want to make a insert into 2 tables visits: visit_id int | card_id int registration: registration_id int | type enum('in','out') | timestamp int | visit_id int i want something like: INSERT INTO `visits` as v ,`registration` as v (v.`visit_id`,v.`card_id`,r.`registration_id`, r.`type`, r.`timestamp`, r.`visit_id`) VALUES (NULL, 12131141,NULL, UNIX_TIMESTAMP(), v.`visit_id`); I wonder if its possible

Read the article
Visual Studio Little Wonders: Box Selection

- by James Michael Hare

So this week I decided I’d do a Little Wonder of a different kind and focus on an underused IDE improvement: Visual Studio’s Box Selection capability. This is a handy feature that many people still don’t realize was made available in Visual Studio 2010 (and beyond). True, there have been other editors in the past with this capability, but now that it’s fully part of Visual Studio we can enjoy it’s goodness from within our own IDE. So, for those of you who don’t know what box selection is and what it allows you to do, read on! Sometimes, we want to select beyond the horizontal… The problem with traditional text selection in many editors is that it is horizontally oriented. Sure, you can select multiple rows, but if you do you will pull in the entire row (at least for the middle rows). Under the old selection scheme, if you wanted to select a portion of text from each row (a “box” of text) you were out of luck. Box selection rectifies this by allowing you to select a box of text that bounded by a selection rectangle that you can grow horizontally or vertically. So let’s think a situation that could occur where this comes in handy. Let’s say, for instance, that we are defining an enum in our code that we want to be able to translate into some string values (possibly to be stored in a database, output to screen, etc.). Perhaps such an enum would look like this: 1: public enum OrderType 2: { 3: Buy, // buy shares of a commodity 4: Sell, // sell shares of a commodity 5: Exchange, // exchange one commodity for another 6: Cancel, // cancel an order for a commodity 7: } 8: Now, let’s say we are in the process of creating a Dictionary<K,V> to translate our OrderType: 1: var translator = new Dictionary<OrderType, string> 2: { 3: // do I really want to retype all this??? 4: }; Yes the example above is contrived so that we will pull some garbage if we do a multi-line select. I could select the lines above using the traditional multi-line selection: And then paste them into the translator code, which would result in this: 1: var translator = new Dictionary<OrderType, string> 2: { 3: Buy, // buy shares of a commodity 4: Sell, // sell shares of a commodity 5: Exchange, // exchange one commodity for another 6: Cancel, // cancel an order for a commodity 7: }; But I have a lot of junk there, sure I can manually clear it out, or use some search and replace magic, but if this were hundreds of lines instead of just a few that would quickly become cumbersome. The Box Selection Now that we have the ability to create box selections, we can select the box of text to delete! Most of us are familiar with the fact we can drag the mouse (or hold [Shift] and use the arrow keys) to create a selection that can span multiple rows: Box selection, however, actually allows us to select a box instead of the typical horizontal lines: Then we can press the [delete] key and the pesky comments are all gone! You can do this either by holding down [Alt] while you select with your mouse, or by holding down [Alt+Shift] and using the arrow keys on the keyboard to grow the box horizontally or vertically. So now we have: 1: var translator = new Dictionary<OrderType, string> 2: { 3: Buy, 4: Sell, 5: Exchange, 6: Cancel, 7: }; Which is closer, but we still need an opening curly, the string to translate to, and the closing curly and comma. Fortunately, again, this is easy with box selections due to the fact box selection can even work for a zero-width selection! That is, hold down [Alt] and either drag down with no width, or hold down [Alt+Shift] and arrow down and you will define a selection range with no width, essentially, a vertical line selection: Notice the faint selection line on the right? So why is this useful? Well, just like with any selected range, we can type and it will replace the selection. What does this mean for box selections? It means that we can insert the same text all the way down on each line! If we have the same selection above, and type a curly and a space, we’d get: Imagine doing this over hundreds of lines and think of what a time saver it could be! Now make a zero-width selection on the other side: And type a curly and a comma, and we’d get: So close! Now finally, imagine we’ve already defined these strings somewhere and want to paste them in: 1: const private string BuyText = "Buy Shares"; 2: const private string SellText = "Sell Shares"; 3: const private string ExchangeText = "Exchange"; 4: const private string CancelText = "Cancel"; We can, again, use our box selection to pull out the constant names: And clicking copy (or [CTRL+C]) and then selecting a range to paste into: And finally clicking paste (or [CTRL+V]) to get the final result: 1: var translator = new Dictionary<OrderType, string> 2: { 3: { Buy, BuyText }, 4: { Sell, SellText }, 5: { Exchange, ExchangeText }, 6: { Cancel, CancelText }, 7: }; Sure, this was a contrived example, but I’m sure you’ll agree that it adds myriad possibilities of new ways to copy and paste vertical selections, as well as inserting text across a vertical slice. Summary: While box selection has been around in other editors, we finally get to experience it in VS2010 and beyond. It is extremely handy for selecting columns of information for cutting, copying, and pasting. In addition, it allows you to create a zero-width vertical insertion point that can be used to enter the same text across multiple rows. Imagine the time you can save adding repetitive code across multiple lines! Try it, the more you use it, the more you’ll love it! Technorati Tags: C#,CSharp,.NET,Visual Studio,Little Wonders,Box Selection

Read the article
C#/.NET Little Wonders – Cross Calling Constructors

- by James Michael Hare

Just a small post today, it’s the final iteration before our release and things are crazy here! This is another little tidbit that I love using, and it should be fairly common knowledge, yet I’ve noticed many times that less experienced developers tend to have redundant constructor code when they overload their constructors. The Problem – repetitive code is less maintainable Let’s say you were designing a messaging system, and so you want to create a class to represent the properties for a Receiver, so perhaps you design a ReceiverProperties class to represent this collection of properties. Perhaps, you decide to make ReceiverProperties immutable, and so you have several constructors that you can use for alternative construction: 1: // Constructs a set of receiver properties. 2: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable, bool isBuffered) 3: { 4: ReceiverType = receiverType; 5: Source = source; 6: IsDurable = isDurable; 7: IsBuffered = isBuffered; 8: } 9: 10: // Constructs a set of receiver properties with buffering on by default. 11: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable) 12: { 13: ReceiverType = receiverType; 14: Source = source; 15: IsDurable = isDurable; 16: IsBuffered = true; 17: } 18: 19: // Constructs a set of receiver properties with buffering on and durability off. 20: public ReceiverProperties(ReceiverType receiverType, string source) 21: { 22: ReceiverType = receiverType; 23: Source = source; 24: IsDurable = false; 25: IsBuffered = true; 26: } Note: keep in mind this is just a simple example for illustration, and in same cases default parameters can also help clean this up, but they have issues of their own. While strictly speaking, there is nothing wrong with this code, logically, it suffers from maintainability flaws. Consider what happens if you add a new property to the class? You have to remember to guarantee that it is set appropriately in every constructor call. This can cause subtle bugs and becomes even uglier when the constructors do more complex logic, error handling, or there are numerous potential overloads (especially if you can’t easily see them all on one screen’s height). The Solution – cross-calling constructors I’d wager nearly everyone knows how to call your base class’s constructor, but you can also cross-call to one of the constructors in the same class by using the this keyword in the same way you use base to call a base constructor. 1: // Constructs a set of receiver properties. 2: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable, bool isBuffered) 3: { 4: ReceiverType = receiverType; 5: Source = source; 6: IsDurable = isDurable; 7: IsBuffered = isBuffered; 8: } 9: 10: // Constructs a set of receiver properties with buffering on by default. 11: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable) 12: : this(receiverType, source, isDurable, true) 13: { 14: } 15: 16: // Constructs a set of receiver properties with buffering on and durability off. 17: public ReceiverProperties(ReceiverType receiverType, string source) 18: : this(receiverType, source, false, true) 19: { 20: } Notice, there is much less code. In addition, the code you have has no repetitive logic. You can define the main constructor that takes all arguments, and the remaining constructors with defaults simply cross-call the main constructor, passing in the defaults. Yes, in some cases default parameters can ease some of this for you, but default parameters only work for compile-time constants (null, string and number literals). For example, if you were creating a TradingDataAdapter that relied on an implementation of ITradingDao which is the data access object to retreive records from the database, you might want two constructors: one that takes an ITradingDao reference, and a default constructor which constructs a specific ITradingDao for ease of use: 1: public TradingDataAdapter(ITradingDao dao) 2: { 3: _tradingDao = dao; 4: 5: // other constructor logic 6: } 7: 8: public TradingDataAdapter() 9: { 10: _tradingDao = new SqlTradingDao(); 11: 12: // same constructor logic as above 13: } As you can see, this isn’t something we can solve with a default parameter, but we could with cross-calling constructors: 1: public TradingDataAdapter(ITradingDao dao) 2: { 3: _tradingDao = dao; 4: 5: // other constructor logic 6: } 7: 8: public TradingDataAdapter() 9: : this(new SqlTradingDao()) 10: { 11: } So in cases like this where you have constructors with non compiler-time constant defaults, default parameters can’t help you and cross-calling constructors is one of your best options. Summary When you have just one constructor doing the job of initializing the class, you can consolidate all your logic and error-handling in one place, thus ensuring that your behavior will be consistent across the constructor calls. This makes the code more maintainable and even easier to read. There will be some cases where cross-calling constructors may be sub-optimal or not possible (if, for example, the overloaded constructors take completely different types and are not just “defaulting” behaviors). You can also use default parameters, of course, but default parameter behavior in a class hierarchy can be problematic (default values are not inherited and in fact can differ) so sometimes multiple constructors are actually preferable. Regardless of why you may need to have multiple constructors, consider cross-calling where you can to reduce redundant logic and clean up the code. Technorati Tags: C#,.NET,Little Wonders

Read the article
A little SQL tip for C# developers

- by MikeParks

The other day at work I came across a handy little block of SQL code from Jeremiah Clark's blog. It's pretty simple logic but through the mind of a C# developer making some quick DB updates, seems to me that it's more likely to end up writing out the code in Solution 1 instead of Solution 2 below to solve the problem. Basically, I needed to check and see if a specific record existed in Table1. If it does exist, then update that record, otherwise insert a new record into Table1. Solution 1: IF EXISTS (SELECT * FROM Table1 WHERE Column1='SomeValue') UPDATE Table1 SET (...) WHERE Column1='SomeValue' ELSE INSERT INTO Table1 VALUES (...) Solution 2: UPDATE Table1 SET (...) WHERE Column1='SomeValue' IF @@ROWCOUNT=0 INSERT INTO Table1 VALUES (...) As Jeremiah explains, they both accomplish the same thing but from a performance standpoint, Solution 2 is the better way to go (saved table/index scan). Just wanted to throw this small tip out there. Thanks! - Mike

Read the article
C#/.NET Little Pitfalls: The Dangers of Casting Boxed Values

- by James Michael Hare

Starting a new series to parallel the Little Wonders series. In this series, I will examine some of the small pitfalls that can occasionally trip up developers. Introduction: Of Casts and Conversions What happens when we try to assign from an int and a double and vice-versa? 1: double pi = 3.14; 2: int theAnswer = 42; 3: 4: // implicit widening conversion, compiles! 5: double doubleAnswer = theAnswer; 6: 7: // implicit narrowing conversion, compiler error! 8: int intPi = pi; As you can see from the comments above, a conversion from a value type where there is no potential data loss is can be done with an implicit conversion. However, when converting from one value type to another may result in a loss of data, you must make the conversion explicit so the compiler knows you accept this risk. That is why the conversion from double to int will not compile with an implicit conversion, we can make the conversion explicit by adding a cast: 1: // explicit narrowing conversion using a cast, compiler 2: // succeeds, but results may have data loss: 3: int intPi = (int)pi; So for value types, the conversions (implicit and explicit) both convert the original value to a new value of the given type. With widening and narrowing references, however, this is not the case. Converting reference types is a bit different from converting value types. First of all when you perform a widening or narrowing you don’t really convert the instance of the object, you just convert the reference itself to the wider or narrower reference type, but both the original and new reference type both refer back to the same object. Secondly, widening and narrowing for reference types refers the going down and up the class hierarchy instead of referring to precision as in value types. That is, a narrowing conversion for a reference type means you are going down the class hierarchy (for example from Shape to Square) whereas a widening conversion means you are going up the class hierarchy (from Square to Shape). 1: var square = new Square(); 2: 3: // implicitly convers because all squares are shapes 4: // (that is, all subclasses can be referenced by a superclass reference) 5: Shape myShape = square; 6: 7: // implicit conversion not possible, not all shapes are squares! 8: // (that is, not all superclasses can be referenced by a subclass reference) 9: Square mySquare = (Square) myShape; So we had to cast the Shape back to Square because at that point the compiler has no way of knowing until runtime whether the Shape in question is truly a Square. But, because the compiler knows that it’s possible for a Shape to be a Square, it will compile. However, if the object referenced by myShape is not truly a Square at runtime, you will get an invalid cast exception. Of course, there are other forms of conversions as well such as user-specified conversions and helper class conversions which are beyond the scope of this post. The main thing we want to focus on is this seemingly innocuous casting method of widening and narrowing conversions that we come to depend on every day and, in some cases, can bite us if we don’t fully understand what is going on! The Pitfall: Conversions on Boxed Value Types Can Fail What if you saw the following code and – knowing nothing else – you were asked if it was legal or not, what would you think: 1: // assuming x is defined above this and this 2: // assignment is syntactically legal. 3: x = 3.14; 4: 5: // convert 3.14 to int. 6: int truncated = (int)x; You may think that since x is obviously a double (can’t be a float) because 3.14 is a double literal, but this is inaccurate. Our x could also be dynamic and this would work as well, or there could be user-defined conversions in play. But there is another, even simpler option that can often bite us: what if x is object? 1: object x; 2: 3: x = 3.14; 4: 5: int truncated = (int) x; On the surface, this seems fine. We have a double and we place it into an object which can be done implicitly through boxing (no cast) because all types inherit from object. Then we cast it to int. This theoretically should be possible because we know we can explicitly convert a double to an int through a conversion process which involves truncation. But here’s the pitfall: when casting an object to another type, we are casting a reference type, not a value type! This means that it will attempt to see at runtime if the value boxed and referred to by x is of type int or derived from type int. Since it obviously isn’t (it’s a double after all) we get an invalid cast exception! Now, you may say this looks awfully contrived, but in truth we can run into this a lot if we’re not careful. Consider using an IDataReader to read from a database, and then attempting to select a result row of a particular column type: 1: using (var connection = new SqlConnection("some connection string")) 2: using (var command = new SqlCommand("select * from employee", connection)) 3: using (var reader = command.ExecuteReader()) 4: { 5: while (reader.Read()) 6: { 7: // if the salary is not an int32 in the SQL database, this is an error! 8: // doesn't matter if short, long, double, float, reader [] returns object! 9: total += (int) reader["annual_salary"]; 10: } 11: } Notice that since the reader indexer returns object, if we attempt to convert using a cast to a type, we have to make darn sure we use the true, actual type or this will fail! If the SQL database column is a double, float, short, etc this will fail at runtime with an invalid cast exception because it attempts to convert the object reference! So, how do you get around this? There are two ways, you could first cast the object to its actual type (double), and then do a narrowing cast to on the value to int. Or you could use a helper class like Convert which analyzes the actual run-time type and will perform a conversion as long as the type implements IConvertible. 1: object x; 2: 3: x = 3.14; 4: 5: // if you want to cast, must cast out of object to double, then 6: // cast convert. 7: int truncated = (int)(double) x; 8: 9: // or you can call a helper class like Convert which examines runtime 10: // type of the value being converted 11: int anotherTruncated = Convert.ToInt32(x); Summary You should always be careful when performing a conversion cast from values boxed in object that you are actually casting to the true type (or a sub-type). Since casting from object is a widening of the reference, be careful that you either know the exact, explicit type you expect to be held in the object, or instead avoid the cast and use a helper class to perform a safe conversion to the type you desire. Technorati Tags: C#,.NET,Pitfalls,Little Pitfalls,BlackRabbitCoder

Read the article
C#/.NET Little Wonders: Comparer<T>.Default

- by James Michael Hare

I’ve been working with a wonderful team on a major release where I work, which has had the side-effect of occupying most of my spare time preparing, testing, and monitoring. However, I do have this Little Wonder tidbit to offer today. Introduction The IComparable<T> interface is great for implementing a natural order for a data type. It’s a very simple interface with a single method: 1: public interface IComparer<in T> 2: { 3: // Compare two instances of same type. 4: int Compare(T x, T y); 5: } So what do we expect for the integer return value? It’s a pseudo-relative measure of the ordering of x and y, which returns an integer value in much the same way C++ returns an integer result from the strcmp() c-style string comparison function: If x == y, returns 0. If x > y, returns > 0 (often +1, but not guaranteed) If x < y, returns < 0 (often –1, but not guaranteed) Notice that the comparison operator used to evaluate against zero should be the same comparison operator you’d use as the comparison operator between x and y. That is, if you want to see if x > y you’d see if the result > 0. The Problem: Comparing With null Can Be Messy This gets tricky though when you have null arguments. According to the MSDN, a null value should be considered equal to a null value, and a null value should be less than a non-null value. So taking this into account we’d expect this instead: If x == y (or both null), return 0. If x > y (or y only is null), return > 0. If x < y (or x only is null), return < 0. But here’s the problem – if x is null, what happens when we attempt to call CompareTo() off of x? 1: // what happens if x is null? 2: x.CompareTo(y); It’s pretty obvious we’ll get a NullReferenceException here. Now, we could guard against this before calling CompareTo(): 1: int result; 2: 3: // first check to see if lhs is null. 4: if (x == null) 5: { 6: // if lhs null, check rhs to decide on return value. 7: if (y == null) 8: { 9: result = 0; 10: } 11: else 12: { 13: result = -1; 14: } 15: } 16: else 17: { 18: // CompareTo() should handle a null y correctly and return > 0 if so. 19: result = x.CompareTo(y); 20: } Of course, we could shorten this with the ternary operator (?:), but even then it’s ugly repetitive code: 1: int result = (x == null) 2: ? ((y == null) ? 0 : -1) 3: : x.CompareTo(y); Fortunately, the null issues can be cleaned up by drafting in an external Comparer. The Soltuion: Comparer<T>.Default You can always develop your own instance of IComparer<T> for the job of comparing two items of the same type. The nice thing about a IComparer is its is independent of the things you are comparing, so this makes it great for comparing in an alternative order to the natural order of items, or when one or both of the items may be null. 1: public class NullableIntComparer : IComparer<int?> 2: { 3: public int Compare(int? x, int? y) 4: { 5: return (x == null) 6: ? ((y == null) ? 0 : -1) 7: : x.Value.CompareTo(y); 8: } 9: } Now, if you want a custom sort -- especially on large-grained objects with different possible sort fields -- this is the best option you have. But if you just want to take advantage of the natural ordering of the type, there is an easier way. If the type you want to compare already implements IComparable<T> or if the type is System.Nullable<T> where T implements IComparable, there is a class in the System.Collections.Generic namespace called Comparer<T> which exposes a property called Default that will create a singleton that represents the default comparer for items of that type. For example: 1: // compares integers 2: var intComparer = Comparer<int>.Default; 3: 4: // compares DateTime values 5: var dateTimeComparer = Comparer<DateTime>.Default; 6: 7: // compares nullable doubles using the null rules! 8: var nullableDoubleComparer = Comparer<double?>.Default; This helps you avoid having to remember the messy null logic and makes it to compare objects where you don’t know if one or more of the values is null. This works especially well when creating say an IComparer<T> implementation for a large-grained class that may or may not contain a field. For example, let’s say you want to create a sorting comparer for a stock open price, but if the market the stock is trading in hasn’t opened yet, the open price will be null. We could handle this (assuming a reasonable Quote definition) like: 1: public class Quote 2: { 3: // the opening price of the symbol quoted 4: public double? Open { get; set; } 5: 6: // ticker symbol 7: public string Symbol { get; set; } 8: 9: // etc. 10: } 11: 12: public class OpenPriceQuoteComparer : IComparer<Quote> 13: { 14: // Compares two quotes by opening price 15: public int Compare(Quote x, Quote y) 16: { 17: return Comparer<double?>.Default.Compare(x.Open, y.Open); 18: } 19: } Summary Defining a custom comparer is often needed for non-natural ordering or defining alternative orderings, but when you just want to compare two items that are IComparable<T> and account for null behavior, you can use the Comparer<T>.Default comparer generator and you’ll never have to worry about correct null value sorting again. Technorati Tags: C#,.NET,Little Wonders,BlackRabbitCoder,IComparable,Comparer

Read the article
C#/.NET Little Wonders: Interlocked Read() and Exchange()

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Last time we discussed the Interlocked class and its Add(), Increment(), and Decrement() methods which are all useful for updating a value atomically by adding (or subtracting). However, this begs the question of how do we set and read those values atomically as well? Read() – Read a value atomically Let’s begin by examining the following code: 1: public class Incrementor 2: { 3: private long _value = 0; 4: 5: public long Value { get { return _value; } } 6: 7: public void Increment() 8: { 9: Interlocked.Increment(ref _value); 10: } 11: } 12: It uses an interlocked increment, as we discuss in my previous post (here), so we know that the increment will be thread-safe. But, to realize what’s potentially wrong we have to know a bit about how atomic reads are in 32 bit and 64 bit .NET environments. When you are dealing with an item smaller or equal to the system word size (such as an int on a 32 bit system or a long on a 64 bit system) then the read is generally atomic, because it can grab all of the bits needed at once. However, when dealing with something larger than the system word size (reading a long on a 32 bit system for example), it cannot grab the whole value at once, which can lead to some problems since this read isn’t atomic. For example, this means that on a 32 bit system we may read one half of the long before another thread increments the value, and the other half of it after the increment. To protect us from reading an invalid value in this manner, we can do an Interlocked.Read() to force the read to be atomic (of course, you’d want to make sure any writes or increments are atomic also): 1: public class Incrementor 2: { 3: private long _value = 0; 4: 5: public long Value 6: { 7: get { return Interlocked.Read(ref _value); } 8: } 9: 10: public void Increment() 11: { 12: Interlocked.Increment(ref _value); 13: } 14: } Now we are guaranteed that we will read the 64 bit value atomically on a 32 bit system, thus ensuring our thread safety (assuming all other reads, writes, increments, etc. are likewise protected). Note that as stated before, and according to the MSDN (here), it isn’t strictly necessary to use Interlocked.Read() for reading 64 bit values on 64 bit systems, but for those still working in 32 bit environments, it comes in handy when dealing with long atomically. Exchange() – Exchanges two values atomically Exchange() lets us store a new value in the given location (the ref parameter) and return the old value as a result. So just as Read() allows us to read atomically, one use of Exchange() is to write values atomically. For example, if we wanted to add a Reset() method to our Incrementor, we could do something like this: 1: public void Reset() 2: { 3: _value = 0; 4: } But the assignment wouldn’t be atomic on 32 bit systems, since the word size is 32 bits and the variable is a long (64 bits). Thus our assignment could have only set half the value when a threaded read or increment happens, which would put us in a bad state. So instead, we could write Reset() like this: 1: public void Reset() 2: { 3: Interlocked.Exchange(ref _value, 0); 4: } And we’d be safe again on a 32 bit system. But this isn’t the only reason Exchange() is valuable. The key comes in realizing that Exchange() doesn’t just set a new value, it returns the old as well in an atomic step. Hence the name “exchange”: you are swapping the value to set with the stored value. So why would we want to do this? Well, anytime you want to set a value and take action based on the previous value. An example of this might be a scheme where you have several tasks, and during every so often, each of the tasks may nominate themselves to do some administrative chore. Perhaps you don’t want to make this thread dedicated for whatever reason, but want to be robust enough to let any of the threads that isn’t currently occupied nominate itself for the job. An easy and lightweight way to do this would be to have a long representing whether someone has acquired the “election” or not. So a 0 would indicate no one has been elected and 1 would indicate someone has been elected. We could then base our nomination strategy as follows: every so often, a thread will attempt an Interlocked.Exchange() on the long and with a value of 1. The first thread to do so will set it to a 1 and return back the old value of 0. We can use this to show that they were the first to nominate and be chosen are thus “in charge”. Anyone who nominates after that will attempt the same Exchange() but will get back a value of 1, which indicates that someone already had set it to a 1 before them, thus they are not elected. Then, the only other step we need take is to remember to release the election flag once the elected thread accomplishes its task, which we’d do by setting the value back to 0. In this way, the next thread to nominate with Exchange() will get back the 0 letting them know they are the new elected nominee. Such code might look like this: 1: public class Nominator 2: { 3: private long _nomination = 0; 4: public bool Elect() 5: { 6: return Interlocked.Exchange(ref _nomination, 1) == 0; 7: } 8: public bool Release() 9: { 10: return Interlocked.Exchange(ref _nomination, 0) == 1; 11: } 12: } There’s many ways to do this, of course, but you get the idea. Running 5 threads doing some “sleep” work might look like this: 1: var nominator = new Nominator(); 2: var random = new Random(); 3: Parallel.For(0, 5, i => 4: { 5: 6: for (int j = 0; j < _iterations; ++j) 7: { 8: if (nominator.Elect()) 9: { 10: // elected 11: Console.WriteLine("Elected nominee " + i); 12: Thread.Sleep(random.Next(100, 5000)); 13: nominator.Release(); 14: } 15: else 16: { 17: // not elected 18: Console.WriteLine("Did not elect nominee " + i); 19: } 20: // sleep before check again 21: Thread.Sleep(1000); 22: } 23: }); And would spit out results like: 1: Elected nominee 0 2: Did not elect nominee 2 3: Did not elect nominee 1 4: Did not elect nominee 4 5: Did not elect nominee 3 6: Did not elect nominee 3 7: Did not elect nominee 1 8: Did not elect nominee 2 9: Did not elect nominee 4 10: Elected nominee 3 11: Did not elect nominee 2 12: Did not elect nominee 1 13: Did not elect nominee 4 14: Elected nominee 0 15: Did not elect nominee 2 16: Did not elect nominee 4 17: ... Another nice thing about the Interlocked.Exchange() is it can be used to thread-safely set pretty much anything 64 bits or less in size including references, pointers (in unsafe mode), floats, doubles, etc. Summary So, now we’ve seen two more things we can do with Interlocked: reading and exchanging a value atomically. Read() and Exchange() are especially valuable for reading/writing 64 bit values atomically in a 32 bit system. Exchange() has value even beyond simply atomic writes by using the Exchange() to your advantage, since it reads and set the value atomically, which allows you to do lightweight nomination systems. There’s still a few more goodies in the Interlocked class which we’ll explore next time! Technorati Tags: C#,CSharp,.NET,Little Wonders,Interlocked

Read the article
Microsoft SQL Server 2005 Inserting into tables from child procedure which returned multiple tables

- by Kevin

I've got a child procedure which returns more than table. child: PROCEDURE KevinGetTwoTables AS BEGIN SELECT 'ABC' Alpha, '123' Numeric SELECT 'BBB' Alpha, '123' Numeric1, '555' Numeric2 END example: PROCEDURE KevinTesting AS BEGIN DECLARE @Table1 TABLE ( Alpha varchar(50), Numeric1 int ) DECLARE @Table2 TABLE ( Alpha varchar(50), Numeric1 int, Numeric2 int ) --INSERT INTO @Table1 EXEC KevinGetTwoTables END

Read the article
SQL Server, temporary tables with truncate vs table variable with delete

- by Richard

I have a stored procedure inside which I create a temporary table that typically contains between 1 and 10 rows. This table is truncated and filled many times during the stored procedure. It is truncated as this is faster than delete. Do I get any performance increase by replacing this temporary table with a table variable when I suffer a penalty for using delete (truncate does not work on table variables) Whilst table variables are mainly in memory and are generally faster than temp tables do I loose any benefit by having to delete rather than truncate?

Read the article
Chained Hash Tables vs. Open-Addressed Hash Tables

- by nomemory

Can somebody explain the main differences between (advantages / disadvantages) the two implementations? For a library, what implementation is recommended?

Read the article
Join performance on MyISAM and InnoDB tables

- by j0nes

I am thinking about converting some tables from MyISAM to InnoDB in my mysql server. The tables will certainly benefit from the change because a lot of write requests come to these tables, while there are also quite a lot of read request at the same time. However, they are often joined together with some tables that almost don't get any writes. Is there a performance penalty when joining together MyISAM and InnoDB tables or should everything work fine? Second question: During backups at night, I am copying data from the InnoDB tables to MyISAM tables for archiving purposes. In these backups, a lot of write-requests happen, however there is almost no read from these archive tables. Would these tables also benefit from using InnoDB or is this just a waste of space and RAM?

Read the article
Does my AMD-based machine use little endian or big endian?

- by Frank

I'm going though a computers system course and I'm trying to establish, for sure, if my AMD based computer is a little endian machine? I believe it is because it would be Intel-compatible. Specifically, my processor is an AMD 64 Athlon x2. I understand that this can matter in C programming. I'm writing C programs and a method I'm using would be affected by this. I'm trying to figure out if I'd get the same results if I ran the program on an Intel based machine (assuming that is little endian machine). Finally, let me ask this: Would any and all machines capable of running Windows (XP, Vista, 2000, Server 2003, etc) and, say, Ubuntu Linux desktop be little endian? Thank You, Frank

Read the article
Concatenating rows from different tables into one field

- by Markus

Hi! In a project using a MSSQL 2005 Database we are required to log all data manipulating actions in a logging table. One field in that table is supposed to contain the row before it was changed. We have a lot of tables so I was trying to write a stored procedure that would gather up all the fields in one row of a table that was given to it, concatenate them somehow and then write a new log entry with that information. I already tried using FOR XML PATH and it worked, but the client doesn't like the XML notation, they want a csv field. Here's what I had with FOR XML PATH: DECLARE @foo varchar(max); SET @foo = (SELECT * FROM table WHERE id = 5775 FOR XML PATH('')); The values for "table", "id" and the actual id (here: 5775) would later be passed in via the call to the stored procedure. Is there any way to do this without getting XML notation and without knowing in advance which fields are going to be returned by the SELECT statement?

Read the article
PHP/MySQL - updateing 2 tables in one request

- by Phil Jackson

Morning, I want to learn more about sql and I'm wanting to update to tables; $query3 = "INSERT INTO `$table1`, `$table2` ($table1.DISPLAY_NAME, $table1.EMAIL_ACCOUNT, $table2.DISPLAY_NAME, $table2.EMAIL_ACCOUNT) values ('" . DISPLAY_NAME . "', '" . EMAIL_ADDRESS . "', '" . $get['rn'] . "', '" . $email . "')"; could some one point me in the right direction on how I would go about this? current error is You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' contacts_ACT_Web_Designs (contacts_E_Jackson.DISPLAY_NAME, contacts_E_Jackson' at line 1 regards, phil

Read the article
C#/.NET Little Wonders: The Predicate, Comparison, and Converter Generic Delegates

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. In the last three weeks, we examined the Action family of delegates (and delegates in general), the Func family of delegates, and the EventHandler family of delegates and how they can be used to support generic, reusable algorithms and classes. This week I will be completing my series on the generic delegates in the .NET Framework with a discussion of three more, somewhat less used, generic delegates: Predicate<T>, Comparison<T>, and Converter<TInput, TOutput>. These are older generic delegates that were introduced in .NET 2.0, mostly for use in the Array and List<T> classes. Though older, it’s good to have an understanding of them and their intended purpose. In addition, you can feel free to use them yourself, though obviously you can also use the equivalents from the Func family of delegates instead. Predicate<T> – delegate for determining matches The Predicate<T> delegate was a very early delegate developed in the .NET 2.0 Framework to determine if an item was a match for some condition in a List<T> or T[]. The methods that tend to use the Predicate<T> include: Find(), FindAll(), FindLast() Uses the Predicate<T> delegate to finds items, in a list/array of type T, that matches the given predicate. FindIndex(), FindLastIndex() Uses the Predicate<T> delegate to find the index of an item, of in a list/array of type T, that matches the given predicate. The signature of the Predicate<T> delegate (ignoring variance for the moment) is: 1: public delegate bool Predicate<T>(T obj); So, this is a delegate type that supports any method taking an item of type T and returning bool. In addition, there is a semantic understanding that this predicate is supposed to be examining the item supplied to see if it matches a given criteria. 1: // finds first even number (2) 2: var firstEven = Array.Find(numbers, n => (n % 2) == 0); 3: 4: // finds all odd numbers (1, 3, 5, 7, 9) 5: var allEvens = Array.FindAll(numbers, n => (n % 2) == 1); 6: 7: // find index of first multiple of 5 (4) 8: var firstFiveMultiplePos = Array.FindIndex(numbers, n => (n % 5) == 0); This delegate has typically been succeeded in LINQ by the more general Func family, so that Predicate<T> and Func<T, bool> are logically identical. Strictly speaking, though, they are different types, so a delegate reference of type Predicate<T> cannot be directly assigned to a delegate reference of type Func<T, bool>, though the same method can be assigned to both. 1: // SUCCESS: the same lambda can be assigned to either 2: Predicate<DateTime> isSameDayPred = dt => dt.Date == DateTime.Today; 3: Func<DateTime, bool> isSameDayFunc = dt => dt.Date == DateTime.Today; 4: 5: // ERROR: once they are assigned to a delegate type, they are strongly 6: // typed and cannot be directly assigned to other delegate types. 7: isSameDayPred = isSameDayFunc; When you assign a method to a delegate, all that is required is that the signature matches. This is why the same method can be assigned to either delegate type since their signatures are the same. However, once the method has been assigned to a delegate type, it is now a strongly-typed reference to that delegate type, and it cannot be assigned to a different delegate type (beyond the bounds of variance depending on Framework version, of course). Comparison<T> – delegate for determining order Just as the Predicate<T> generic delegate was birthed to give Array and List<T> the ability to perform type-safe matching, the Comparison<T> was birthed to give them the ability to perform type-safe ordering. The Comparison<T> is used in Array and List<T> for: Sort() A form of the Sort() method that takes a comparison delegate; this is an alternate way to custom sort a list/array from having to define custom IComparer<T> classes. The signature for the Comparison<T> delegate looks like (without variance): 1: public delegate int Comparison<T>(T lhs, T rhs); The goal of this delegate is to compare the left-hand-side to the right-hand-side and return a negative number if the lhs < rhs, zero if they are equal, and a positive number if the lhs > rhs. Generally speaking, null is considered to be the smallest value of any reference type, so null should always be less than non-null, and two null values should be considered equal. In most sort/ordering methods, you must specify an IComparer<T> if you want to do custom sorting/ordering. The Array and List<T> types, however, also allow for an alternative Comparison<T> delegate to be used instead, essentially, this lets you perform the custom sort without having to have the custom IComparer<T> class defined. It should be noted, however, that the LINQ OrderBy(), and ThenBy() family of methods do not support the Comparison<T> delegate (though one could easily add their own extension methods to create one, or create an IComparer() factory class that generates one from a Comparison<T>). So, given this delegate, we could use it to perform easy sorts on an Array or List<T> based on custom fields. Say for example we have a data class called Employee with some basic employee information: 1: public sealed class Employee 2: { 3: public string Name { get; set; } 4: public int Id { get; set; } 5: public double Salary { get; set; } 6: } And say we had a List<Employee> that contained data, such as: 1: var employees = new List<Employee> 2: { 3: new Employee { Name = "John Smith", Id = 2, Salary = 37000.0 }, 4: new Employee { Name = "Jane Doe", Id = 1, Salary = 57000.0 }, 5: new Employee { Name = "John Doe", Id = 5, Salary = 60000.0 }, 6: new Employee { Name = "Jane Smith", Id = 3, Salary = 59000.0 } 7: }; Now, using the Comparison<T> delegate form of Sort() on the List<Employee>, we can sort our list many ways: 1: // sort based on employee ID 2: employees.Sort((lhs, rhs) => Comparer<int>.Default.Compare(lhs.Id, rhs.Id)); 3: 4: // sort based on employee name 5: employees.Sort((lhs, rhs) => string.Compare(lhs.Name, rhs.Name)); 6: 7: // sort based on salary, descending (note switched lhs/rhs order for descending) 8: employees.Sort((lhs, rhs) => Comparer<double>.Default.Compare(rhs.Salary, lhs.Salary)); So again, you could use this older delegate, which has a lot of logical meaning to it’s name, or use a generic delegate such as Func<T, T, int> to implement the same sort of behavior. All this said, one of the reasons, in my opinion, that Comparison<T> isn’t used too often is that it tends to need complex lambdas, and the LINQ ability to order based on projections is much easier to use, though the Array and List<T> sorts tend to be more efficient if you want to perform in-place ordering. Converter<TInput, TOutput> – delegate to convert elements The Converter<TInput, TOutput> delegate is used by the Array and List<T> delegate to specify how to convert elements from an array/list of one type (TInput) to another type (TOutput). It is used in an array/list for: ConvertAll() Converts all elements from a List<TInput> / TInput[] to a new List<TOutput> / TOutput[]. The delegate signature for Converter<TInput, TOutput> is very straightforward (ignoring variance): 1: public delegate TOutput Converter<TInput, TOutput>(TInput input); So, this delegate’s job is to taken an input item (of type TInput) and convert it to a return result (of type TOutput). Again, this is logically equivalent to a newer Func delegate with a signature of Func<TInput, TOutput>. In fact, the latter is how the LINQ conversion methods are defined. So, we could use the ConvertAll() syntax to convert a List<T> or T[] to different types, such as: 1: // get a list of just employee IDs 2: var empIds = employees.ConvertAll(emp => emp.Id); 3: 4: // get a list of all emp salaries, as int instead of double: 5: var empSalaries = employees.ConvertAll(emp => (int)emp.Salary); Note that the expressions above are logically equivalent to using LINQ’s Select() method, which gives you a lot more power: 1: // get a list of just employee IDs 2: var empIds = employees.Select(emp => emp.Id).ToList(); 3: 4: // get a list of all emp salaries, as int instead of double: 5: var empSalaries = employees.Select(emp => (int)emp.Salary).ToList(); The only difference with using LINQ is that many of the methods (including Select()) are deferred execution, which means that often times they will not perform the conversion for an item until it is requested. This has both pros and cons in that you gain the benefit of not performing work until it is actually needed, but on the flip side if you want the results now, there is overhead in the behind-the-scenes work that support deferred execution (it’s supported by the yield return / yield break keywords in C# which define iterators that maintain current state information). In general, the new LINQ syntax is preferred, but the older Array and List<T> ConvertAll() methods are still around, as is the Converter<TInput, TOutput> delegate. Sidebar: Variance support update in .NET 4.0 Just like our descriptions of Func and Action, these three early generic delegates also support more variance in assignment as of .NET 4.0. Their new signatures are: 1: // comparison is contravariant on type being compared 2: public delegate int Comparison<in T>(T lhs, T rhs); 3: 4: // converter is contravariant on input and covariant on output 5: public delegate TOutput Contravariant<in TInput, out TOutput>(TInput input); 6: 7: // predicate is contravariant on input 8: public delegate bool Predicate<in T>(T obj); Thus these delegates can now be assigned to delegates allowing for contravariance (going to a more derived type) or covariance (going to a less derived type) based on whether the parameters are input or output, respectively. Summary Today, we wrapped up our generic delegates discussion by looking at three lesser-used delegates: Predicate<T>, Comparison<T>, and Converter<TInput, TOutput>. All three of these tend to be replaced by their more generic Func equivalents in LINQ, but that doesn’t mean you shouldn’t understand what they do or can’t use them for your own code, as they do contain semantic meanings in their names that sometimes get lost in the more generic Func name. Tweet Technorati Tags: C#,CSharp,.NET,Little Wonders,delegates,generics,Predicate,Converter,Comparison

Read the article
MySQL – Beginning Temporary Tables in MySQL

- by Pinal Dave

MySQL supports Temporary tables to store the resultsets temporarily for a given connection. Temporary tables are created with the keyword TEMPORARY along with the CREATE TABLE statement. Let us create the temporary table named Temp CREATE TEMPORARY TABLE TEMP (id INT); Now you can find out the column names using DESC command DESC TEMP; The above returns the following result This table can be accessed only for the current connection and it can be used like a permanent table and automatically dropped when the connection is closed. However, you can not find temporary tables using INFORMATION_SCHEMA. TABLES system view. It will only list out the permanent tables. MySQL usually stores the data of temporary tables in memory and processed by Memory Storage engine. But if the data size is too large MySQL automatically converts this to the on – disk table and use MyISAM engine. You can also create a permanent table with the same name of a temporary table in the same connection. However the structure of permanent table is visible only if the temporary table with the same name is dropped. Let us create a permanent table with the same name Temp as below CREATE TABLE TEMP (id INT, names VARCHAR(100)); Now running the following command stills gives you the structure of the temporary table temp created earlier. DESC TEMP; You can drop the temporary table using DROP TEMPORARY TABLE command; DROP TEMPORARY TABLE TEMP; After you executed the temporary table, run the following command DESC TEMP; Now you will see the structure of the permanent table named temp In summary – If there is a Temporary Table in MySQL it gets first priority over the permanent table in the session. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Tips and Tricks, T SQL

Read the article
C#/.NET Little Wonders: Use Cast() and TypeOf() to Change Sequence Type

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. We’ve seen how the Select() extension method lets you project a sequence from one type to a new type which is handy for getting just parts of items, or building new items. But what happens when the items in the sequence are already the type you want, but the sequence itself is typed to an interface or super-type instead of the sub-type you need? For example, you may have a sequence of Rectangle stored in an IEnumerable<Shape> and want to consider it an IEnumerable<Rectangle> sequence instead. Today we’ll look at two handy extension methods, Cast<TResult>() and OfType<TResult>() which help you with this task. Cast<TResult>() – Attempt to cast all items to type TResult So, the first thing we can do would be to attempt to create a sequence of TResult from every item in the source sequence. Typically we’d do this if we had an IEnumerable<T> where we knew that every item was actually a TResult where TResult inherits/implements T. For example, assume the typical Shape example classes: 1: // abstract base class 2: public abstract class Shape { } 3: 4: // a basic rectangle 5: public class Rectangle : Shape 6: { 7: public int Widtgh { get; set; } 8: public int Height { get; set; } 9: } And let’s assume we have a sequence of Shape where every Shape is a Rectangle… 1: var shapes = new List<Shape> 2: { 3: new Rectangle { Width = 3, Height = 5 }, 4: new Rectangle { Width = 10, Height = 13 }, 5: // ... 6: }; To get the sequence of Shape as a sequence of Rectangle, of course, we could use a Select() clause, such as: 1: // select each Shape, cast it to Rectangle 2: var rectangles = shapes 3: .Select(s => (Rectangle)s) 4: .ToList(); But that’s a bit verbose, and fortunately there is already a facility built in and ready to use in the form of the Cast<TResult>() extension method: 1: // cast each item to Rectangle and store in a List<Rectangle> 2: var rectangles = shapes 3: .Cast<Rectangle>() 4: .ToList(); However, we should note that if anything in the list cannot be cast to a Rectangle, you will get an InvalidCastException thrown at runtime. Thus, if our Shape sequence had a Circle in it, the call to Cast<Rectangle>() would have failed. As such, you should only do this when you are reasonably sure of what the sequence actually contains (or are willing to handle an exception if you’re wrong). Another handy use of Cast<TResult>() is using it to convert an IEnumerable to an IEnumerable<T>. If you look at the signature, you’ll see that the Cast<TResult>() extension method actually extends the older, object-based IEnumerable interface instead of the newer, generic IEnumerable<T>. This is your gateway method for being able to use LINQ on older, non-generic sequences. For example, consider the following: 1: // the older, non-generic collections are sequence of object 2: var shapes = new ArrayList 3: { 4: new Rectangle { Width = 3, Height = 13 }, 5: new Rectangle { Width = 10, Height = 20 }, 6: // ... 7: }; Since this is an older, object based collection, we cannot use the LINQ extension methods on it directly. For example, if I wanted to query the Shape sequence for only those Rectangles whose Width is > 5, I can’t do this: 1: // compiler error, Where() operates on IEnumerable<T>, not IEnumerable 2: var bigRectangles = shapes.Where(r => r.Width > 5); However, I can use Cast<Rectangle>() to treat my ArrayList as an IEnumerable<Rectangle> and then do the query! 1: // ah, that’s better! 2: var bigRectangles = shapes.Cast<Rectangle>().Where(r => r.Width > 5); Or, if you prefer, in LINQ query expression syntax: 1: var bigRectangles = from s in shapes.Cast<Rectangle>() 2: where s.Width > 5 3: select s; One quick warning: Cast<TResult>() only attempts to cast, it won’t perform a cast conversion. That is, consider this: 1: var intList = new List<int> { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89 }; 2: 3: // casting ints to longs, this should work, right? 4: var asLong = intList.Cast<long>().ToList(); Will the code above work? No, you’ll get a InvalidCastException. Remember that Cast<TResult>() is an extension of IEnumerable, thus it is a sequence of object, which means that it will box every int as an object as it enumerates over it, and there is no cast conversion from object to long, and thus the cast fails. In other words, a cast from int to long will succeed because there is a conversion from int to long. But a cast from int to object to long will not, because you can only unbox an item by casting it to its exact type. For more information on why cast-converting boxed values doesn’t work, see this post on The Dangers of Casting Boxed Values (here). OfType<TResult>() – Filter sequence to only items of type TResult So, we’ve seen how we can use Cast<TResult>() to change the type of our sequence, when we expect all the items of the sequence to be of a specific type. But what do we do when a sequence contains many different types, and we are only concerned with a subset of a given type? For example, what if a sequence of Shape contains Rectangle and Circle instances, and we just want to select all of the Rectangle instances? Well, let’s say we had this sequence of Shape: 1: var shapes = new List<Shape> 2: { 3: new Rectangle { Width = 3, Height = 5 }, 4: new Rectangle { Width = 10, Height = 13 }, 5: new Circle { Radius = 10 }, 6: new Square { Side = 13 }, 7: // ... 8: }; Well, we could get the rectangles using Select(), like: 1: var onlyRectangles = shapes.Where(s => s is Rectangle).ToList(); But fortunately, an easier way has already been written for us in the form of the OfType<T>() extension method: 1: // returns only a sequence of the shapes that are Rectangles 2: var onlyRectangles = shapes.OfType<Rectangle>().ToList(); Now we have a sequence of only the Rectangles in the original sequence, we can also use this to chain other queries that depend on Rectangles, such as: 1: // select only Rectangles, then filter to only those more than 2: // 5 units wide... 3: var onlyBigRectangles = shapes.OfType<Rectangle>() 4: .Where(r => r.Width > 5) 5: .ToList(); The OfType<Rectangle>() will filter the sequence to only the items that are of type Rectangle (or a subclass of it), and that results in an IEnumerable<Rectangle>, we can then apply the other LINQ extension methods to query that list further. Just as Cast<TResult>() is an extension method on IEnumerable (and not IEnumerable<T>), the same is true for OfType<T>(). This means that you can use OfType<TResult>() on object-based collections as well. For example, given an ArrayList containing Shapes, as below: 1: // object-based collections are a sequence of object 2: var shapes = new ArrayList 3: { 4: new Rectangle { Width = 3, Height = 5 }, 5: new Rectangle { Width = 10, Height = 13 }, 6: new Circle { Radius = 10 }, 7: new Square { Side = 13 }, 8: // ... 9: }; We can use OfType<Rectangle> to filter the sequence to only Rectangle items (and subclasses), and then chain other LINQ expressions, since we will then be of type IEnumerable<Rectangle>: 1: // OfType() converts the sequence of object to a new sequence 2: // containing only Rectangle or sub-types of Rectangle. 3: var onlyBigRectangles = shapes.OfType<Rectangle>() 4: .Where(r => r.Width > 5) 5: .ToList(); Summary So now we’ve seen two different ways to get a sequence of a superclass or interface down to a more specific sequence of a subclass or implementation. The Cast<TResult>() method casts every item in the source sequence to type TResult, and the OfType<TResult>() method selects only those items in the source sequence that are of type TResult. You can use these to downcast sequences, or adapt older types and sequences that only implement IEnumerable (such as DataTable, ArrayList, etc.). Technorati Tags: C#,CSharp,.NET,LINQ,Little Wonders,TypeOf,Cast,IEnumerable<T>

Read the article
Search multiple tables

- by gilden

I have developed a web application that is used mainly for archiving all sorts of textual material (documents, references to articles, books, magazines etc.). There can be any given number of archive tables in my system, each with its own schema. The schema can be changed by a moderator through the application (imagine something similar to a really dumbed down version of phpMyAdmin). Users can search for anything from all of the tables. By using FULLTEXT indexes together with substring searching (fields which do not support FULLTEXT indexing) the script inserts the results of a search to a single table and by ordering these results by the similarity measure I can fairly easily return the paginated results. However, this approach has a few problems: substring searching can only count exact results the 50% rule applies to all tables separately and thus, mysql may not return important matches or too naively discards common words. is quite expensive in terms of query numbers and execution time (not an issue right now as there's not a lot of data yet in the tables). normalized data is not even searched for (I have different tables for categories, languages and file attatchments). My planned solution Create a single table having columns similar to id, table_id, row_id, data Every time a new row is created/modified/deleted in any of the data tables this central table also gets updated with the data column containing a concatenation of all the fields in a row. I could then create a single index for Sphinx and use it for doing searches instead. Are there any more efficient solutions or best practises how to approach this? Thanks.

Read the article
C#/.NET Little Wonders: The Concurrent Collections (1 of 3)

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In the next few weeks, we will discuss the concurrent collections and how they have changed the face of concurrent programming. This week’s post will begin with a general introduction and discuss the ConcurrentStack<T> and ConcurrentQueue<T>. Then in the following post we’ll discuss the ConcurrentDictionary<T> and ConcurrentBag<T>. Finally, we shall close on the third post with a discussion of the BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. A brief history of collections In the beginning was the .NET 1.0 Framework. And out of this framework emerged the System.Collections namespace, and it was good. It contained all the basic things a growing programming language needs like the ArrayList and Hashtable collections. The main problem, of course, with these original collections is that they held items of type object which means you had to be disciplined enough to use them correctly or you could end up with runtime errors if you got an object of a type you weren't expecting. Then came .NET 2.0 and generics and our world changed forever! With generics the C# language finally got an equivalent of the very powerful C++ templates. As such, the System.Collections.Generic was born and we got type-safe versions of all are favorite collections. The List<T> succeeded the ArrayList and the Dictionary<TKey,TValue> succeeded the Hashtable and so on. The new versions of the library were not only safer because they checked types at compile-time, in many cases they were more performant as well. So much so that it's Microsoft's recommendation that the System.Collections original collections only be used for backwards compatibility. So we as developers came to know and love the generic collections and took them into our hearts and embraced them. The problem is, thread safety in both the original collections and the generic collections can be problematic, for very different reasons. Now, if you are only doing single-threaded development you may not care – after all, no locking is required. Even if you do have multiple threads, if a collection is “load-once, read-many” you don’t need to do anything to protect that container from multi-threaded access, as illustrated below: 1: public static class OrderTypeTranslator 2: { 3: // because this dictionary is loaded once before it is ever accessed, we don't need to synchronize 4: // multi-threaded read access 5: private static readonly Dictionary<string, char> _translator = new Dictionary<string, char> 6: { 7: {"New", 'N'}, 8: {"Update", 'U'}, 9: {"Cancel", 'X'} 10: }; 11: 12: // the only public interface into the dictionary is for reading, so inherently thread-safe 13: public static char? Translate(string orderType) 14: { 15: char charValue; 16: if (_translator.TryGetValue(orderType, out charValue)) 17: { 18: return charValue; 19: } 20: 21: return null; 22: } 23: } Unfortunately, most of our computer science problems cannot get by with just single-threaded applications or with multi-threading in a load-once manner. Looking at today's trends, it's clear to see that computers are not so much getting faster because of faster processor speeds -- we've nearly reached the limits we can push through with today's technologies -- but more because we're adding more cores to the boxes. With this new hardware paradigm, it is even more important to use multi-threaded applications to take full advantage of parallel processing to achieve higher application speeds. So let's look at how to use collections in a thread-safe manner. Using historical collections in a concurrent fashion The early .NET collections (System.Collections) had a Synchronized() static method that could be used to wrap the early collections to make them completely thread-safe. This paradigm was dropped in the generic collections (System.Collections.Generic) because having a synchronized wrapper resulted in atomic locks for all operations, which could prove overkill in many multithreading situations. Thus the paradigm shifted to having the user of the collection specify their own locking, usually with an external object: 1: public class OrderAggregator 2: { 3: private static readonly Dictionary<string, List<Order>> _orders = new Dictionary<string, List<Order>>(); 4: private static readonly _orderLock = new object(); 5: 6: public void Add(string accountNumber, Order newOrder) 7: { 8: List<Order> ordersForAccount; 9: 10: // a complex operation like this should all be protected 11: lock (_orderLock) 12: { 13: if (!_orders.TryGetValue(accountNumber, out ordersForAccount)) 14: { 15: _orders.Add(accountNumber, ordersForAccount = new List<Order>()); 16: } 17: 18: ordersForAccount.Add(newOrder); 19: } 20: } 21: } Notice how we’re performing several operations on the dictionary under one lock. With the Synchronized() static methods of the early collections, you wouldn’t be able to specify this level of locking (a more macro-level). So in the generic collections, it was decided that if a user needed synchronization, they could implement their own locking scheme instead so that they could provide synchronization as needed. The need for better concurrent access to collections Here’s the problem: it’s relatively easy to write a collection that locks itself down completely for access, but anything more complex than that can be difficult and error-prone to write, and much less to make it perform efficiently! For example, what if you have a Dictionary that has frequent reads but in-frequent updates? Do you want to lock down the entire Dictionary for every access? This would be overkill and would prevent concurrent reads. In such cases you could use something like a ReaderWriterLockSlim which allows for multiple readers in a lock, and then once a writer grabs the lock it blocks all further readers until the writer is done (in a nutshell). This is all very complex stuff to consider. Fortunately, this is where the Concurrent Collections come in. The Parallel Computing Platform team at Microsoft went through great pains to determine how to make a set of concurrent collections that would have the best performance characteristics for general case multi-threaded use. Now, as in all things involving threading, you should always make sure you evaluate all your container options based on the particular usage scenario and the degree of parallelism you wish to acheive. This article should not be taken to understand that these collections are always supperior to the generic collections. Each fills a particular need for a particular situation. Understanding what each container is optimized for is key to the success of your application whether it be single-threaded or multi-threaded. General points to consider with the concurrent collections The MSDN points out that the concurrent collections all support the ICollection interface. However, since the collections are already synchronized, the IsSynchronized property always returns false, and SyncRoot always returns null. Thus you should not attempt to use these properties for synchronization purposes. Note that since the concurrent collections also may have different operations than the traditional data structures you may be used to. Now you may ask why they did this, but it was done out of necessity to keep operations safe and atomic. For example, in order to do a Pop() on a stack you have to know the stack is non-empty, but between the time you check the stack’s IsEmpty property and then do the Pop() another thread may have come in and made the stack empty! This is why some of the traditional operations have been changed to make them safe for concurrent use. In addition, some properties and methods in the concurrent collections achieve concurrency by creating a snapshot of the collection, which means that some operations that were traditionally O(1) may now be O(n) in the concurrent models. I’ll try to point these out as we talk about each collection so you can be aware of any potential performance impacts. Finally, all the concurrent containers are safe for enumeration even while being modified, but some of the containers support this in different ways (snapshot vs. dirty iteration). Once again I’ll highlight how thread-safe enumeration works for each collection. ConcurrentStack<T>: The thread-safe LIFO container The ConcurrentStack<T> is the thread-safe counterpart to the System.Collections.Generic.Stack<T>, which as you may remember is your standard last-in-first-out container. If you think of algorithms that favor stack usage (for example, depth-first searches of graphs and trees) then you can see how using a thread-safe stack would be of benefit. The ConcurrentStack<T> achieves thread-safe access by using System.Threading.Interlocked operations. This means that the multi-threaded access to the stack requires no traditional locking and is very, very fast! For the most part, the ConcurrentStack<T> behaves like it’s Stack<T> counterpart with a few differences: Pop() was removed in favor of TryPop() Returns true if an item existed and was popped and false if empty. PushRange() and TryPopRange() were added Allows you to push multiple items and pop multiple items atomically. Count takes a snapshot of the stack and then counts the items. This means it is a O(n) operation, if you just want to check for an empty stack, call IsEmpty instead which is O(1). ToArray() and GetEnumerator() both also take snapshots. This means that iteration over a stack will give you a static view at the time of the call and will not reflect updates. Pushing on a ConcurrentStack<T> works just like you’d expect except for the aforementioned PushRange() method that was added to allow you to push a range of items concurrently. 1: var stack = new ConcurrentStack<string>(); 2: 3: // adding to stack is much the same as before 4: stack.Push("First"); 5: 6: // but you can also push multiple items in one atomic operation (no interleaves) 7: stack.PushRange(new [] { "Second", "Third", "Fourth" }); For looking at the top item of the stack (without removing it) the Peek() method has been removed in favor of a TryPeek(). This is because in order to do a peek the stack must be non-empty, but between the time you check for empty and the time you execute the peek the stack contents may have changed. Thus the TryPeek() was created to be an atomic check for empty, and then peek if not empty: 1: // to look at top item of stack without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (stack.TryPeek(out item)) 5: { 6: Console.WriteLine("Top item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Stack was empty."); 11: } Finally, to remove items from the stack, we have the TryPop() for single, and TryPopRange() for multiple items. Just like the TryPeek(), these operations replace Pop() since we need to ensure atomically that the stack is non-empty before we pop from it: 1: // to remove items, use TryPop or TryPopRange to get multiple items atomically (no interleaves) 2: if (stack.TryPop(out item)) 3: { 4: Console.WriteLine("Popped " + item); 5: } 6: 7: // TryPopRange will only pop up to the number of spaces in the array, the actual number popped is returned. 8: var poppedItems = new string[2]; 9: int numPopped = stack.TryPopRange(poppedItems); 10: 11: foreach (var theItem in poppedItems.Take(numPopped)) 12: { 13: Console.WriteLine("Popped " + theItem); 14: } Finally, note that as stated before, GetEnumerator() and ToArray() gets a snapshot of the data at the time of the call. That means if you are enumerating the stack you will get a snapshot of the stack at the time of the call. This is illustrated below: 1: var stack = new ConcurrentStack<string>(); 2: 3: // adding to stack is much the same as before 4: stack.Push("First"); 5: 6: var results = stack.GetEnumerator(); 7: 8: // but you can also push multiple items in one atomic operation (no interleaves) 9: stack.PushRange(new [] { "Second", "Third", "Fourth" }); 10: 11: while(results.MoveNext()) 12: { 13: Console.WriteLine("Stack only has: " + results.Current); 14: } The only item that will be printed out in the above code is "First" because the snapshot was taken before the other items were added. This may sound like an issue, but it’s really for safety and is more correct. You don’t want to enumerate a stack and have half a view of the stack before an update and half a view of the stack after an update, after all. In addition, note that this is still thread-safe, whereas iterating through a non-concurrent collection while updating it in the old collections would cause an exception. ConcurrentQueue<T>: The thread-safe FIFO container The ConcurrentQueue<T> is the thread-safe counterpart of the System.Collections.Generic.Queue<T> class. The concurrent queue uses an underlying list of small arrays and lock-free System.Threading.Interlocked operations on the head and tail arrays. Once again, this allows us to do thread-safe operations without the need for heavy locks! The ConcurrentQueue<T> (like the ConcurrentStack<T>) has some departures from the non-concurrent counterpart. Most notably: Dequeue() was removed in favor of TryDequeue(). Returns true if an item existed and was dequeued and false if empty. Count does not take a snapshot It subtracts the head and tail index to get the count. This results overall in a O(1) complexity which is quite good. It’s still recommended, however, that for empty checks you call IsEmpty instead of comparing Count to zero. ToArray() and GetEnumerator() both take snapshots. This means that iteration over a queue will give you a static view at the time of the call and will not reflect updates. The Enqueue() method on the ConcurrentQueue<T> works much the same as the generic Queue<T>: 1: var queue = new ConcurrentQueue<string>(); 2: 3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: queue.Enqueue("Second"); 6: queue.Enqueue("Third"); For front item access, the TryPeek() method must be used to attempt to see the first item if the queue. There is no Peek() method since, as you’ll remember, we can only peek on a non-empty queue, so we must have an atomic TryPeek() that checks for empty and then returns the first item if the queue is non-empty. 1: // to look at first item in queue without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (queue.TryPeek(out item)) 5: { 6: Console.WriteLine("First item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Queue was empty."); 11: } Then, to remove items you use TryDequeue(). Once again this is for the same reason we have TryPeek() and not Peek(): 1: // to remove items, use TryDequeue. If queue is empty returns false. 2: if (queue.TryDequeue(out item)) 3: { 4: Console.WriteLine("Dequeued first item " + item); 5: } Just like the concurrent stack, the ConcurrentQueue<T> takes a snapshot when you call ToArray() or GetEnumerator() which means that subsequent updates to the queue will not be seen when you iterate over the results. Thus once again the code below will only show the first item, since the other items were added after the snapshot. 1: var queue = new ConcurrentQueue<string>(); 2: 3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: 6: var iterator = queue.GetEnumerator(); 7: 8: queue.Enqueue("Second"); 9: queue.Enqueue("Third"); 10: 11: // only shows First 12: while (iterator.MoveNext()) 13: { 14: Console.WriteLine("Dequeued item " + iterator.Current); 15: } Using collections concurrently You’ll notice in the examples above I stuck to using single-threaded examples so as to make them deterministic and the results obvious. Of course, if we used these collections in a truly multi-threaded way the results would be less deterministic, but would still be thread-safe and with no locking on your part required! For example, say you have an order processor that takes an IEnumerable<Order> and handles each other in a multi-threaded fashion, then groups the responses together in a concurrent collection for aggregation. This can be done easily with the TPL’s Parallel.ForEach(): 1: public static IEnumerable<OrderResult> ProcessOrders(IEnumerable<Order> orderList) 2: { 3: var proxy = new OrderProxy(); 4: var results = new ConcurrentQueue<OrderResult>(); 5: 6: // notice that we can process all these in parallel and put the results 7: // into our concurrent collection without needing any external locking! 8: Parallel.ForEach(orderList, 9: order => 10: { 11: var result = proxy.PlaceOrder(order); 12: 13: results.Enqueue(result); 14: }); 15: 16: return results; 17: } Summary Obviously, if you do not need multi-threaded safety, you don’t need to use these collections, but when you do need multi-threaded collections these are just the ticket! The plethora of features (I always think of the movie The Three Amigos when I say plethora) built into these containers and the amazing way they acheive thread-safe access in an efficient manner is wonderful to behold. Stay tuned next week where we’ll continue our discussion with the ConcurrentBag<T> and the ConcurrentDictionary<TKey,TValue>. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. Tweet Technorati Tags: C#,.NET,Concurrent Collections,Collections,Multi-Threading,Little Wonders,BlackRabbitCoder,James Michael Hare

Read the article
Ad-hoc taxonomy: owning the chess set doesn't mean you decide how the little horsey moves

- by Roger Hart

There was one of those little laugh-or-cry moments recently when I heard an anecdote about content strategy failings at a major online retailer. The story goes a bit like this: successful company in a highly commoditized marketplace succeeds on price and largely ignores its content team. Being relatively entrepreneurial, the founders are still knocking around, and occasionally like to "take an interest". One day, they decree that clothing sold on the site can no longer be described as "unisex", because this sounds old fashioned. Sad now. Let me just reiterate for the folks at the back: large retailer, commoditized market place, differentiating on price. That's inherently unstable. Sooner or later, they're going to need one or both of competitive differentiation and significant optimization. I can't speak for the latter, since I'm hypothesizing off a raft of rumour, but one of the simpler paths to the former is to become - or rather acknowledge that they are - a content business. Regardless, they need highly-searchable terminology. Even in the face of tooth and claw resistance to noticing the fundamental position content occupies in driving sales (and SEO) on the web, there's a clear information problem here. Dilettante taxonomy is a disaster. Ok, so this is a small example, but that kind of makes it a good one. Unisex probably is the best way of describing clothing designed to suit either men or women interchangeably. It certainly takes less time to type (and read). It's established terminology, and as a single word, it's significantly better for web readability than a phrasal workaround. Something like "fits men or women" is short, by could fall foul of clause-level discard in web scanning. It's not an adjective, so for intuitive reading it's never going to be near the start of a title or description. It would also clutter up search results, and impose cognitive load in list scanning. Sorry kids, it's just worse. Even if "unisex" were an archaism (which it isn't), the only thing that would weigh against its being more usable and concise terminology would be evidence that this archaism were hurting conversions. Good luck with that. We once - briefly - called one of our products a "Can of worms". It was a bundle in a bug-tracking suite, and we thought it sounded terribly cool. Guess how well that sold. We have information and content professionals for a reason: to make sure that whatever we put in front of users is optimised to meet user and business goals. If that thinking doesn't inform style guides, taxonomy, messaging, title structure, and so forth, you might as well be finger painting.

Read the article
C#/.NET Little Wonders: The ConcurrentDictionary

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In this series of posts, we will discuss how the concurrent collections have been developed to help alleviate these multi-threading concerns. Last week’s post began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>. Today's post discusses the ConcurrentDictionary<T> (originally I had intended to discuss ConcurrentBag this week as well, but ConcurrentDictionary had enough information to create a very full post on its own!). Finally next week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. Recap As you'll recall from the previous post, the original collections were object-based containers that accomplished synchronization through a Synchronized member. While these were convenient because you didn't have to worry about writing your own synchronization logic, they were a bit too finely grained and if you needed to perform multiple operations under one lock, the automatic synchronization didn't buy much. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization. This cuts both ways in that you have a lot more control as a developer over when and how fine-grained you want to synchronize, but on the other hand if you just want simple synchronization it creates more work. With .NET 4.0, we get the best of both worlds in generic collections. A new breed of collections was born called the concurrent collections in the System.Collections.Concurrent namespace. These amazing collections are fine-tuned to have best overall performance for situations requiring concurrent access. They are not meant to replace the generic collections, but to simply be an alternative to creating your own locking mechanisms. Among those concurrent collections were the ConcurrentStack<T> and ConcurrentQueue<T> which provide classic LIFO and FIFO collections with a concurrent twist. As we saw, some of the traditional methods that required calls to be made in a certain order (like checking for not IsEmpty before calling Pop()) were replaced in favor of an umbrella operation that combined both under one lock (like TryPop()). Now, let's take a look at the next in our series of concurrent collections!For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentDictionary – the fully thread-safe dictionary The ConcurrentDictionary<TKey,TValue> is the thread-safe counterpart to the generic Dictionary<TKey, TValue> collection. Obviously, both are designed for quick – O(1) – lookups of data based on a key. If you think of algorithms where you need lightning fast lookups of data and don’t care whether the data is maintained in any particular ordering or not, the unsorted dictionaries are generally the best way to go. Note: as a side note, there are sorted implementations of IDictionary, namely SortedDictionary and SortedList which are stored as an ordered tree and a ordered list respectively. While these are not as fast as the non-sorted dictionaries – they are O(log2 n) – they are a great combination of both speed and ordering -- and still greatly outperform a linear search. Now, once again keep in mind that if all you need to do is load a collection once and then allow multi-threaded reading you do not need any locking. Examples of this tend to be situations where you load a lookup or translation table once at program start, then keep it in memory for read-only reference. In such cases locking is completely non-productive. However, most of the time when we need a concurrent dictionary we are interleaving both reads and updates. This is where the ConcurrentDictionary really shines! It achieves its thread-safety with no common lock to improve efficiency. It actually uses a series of locks to provide concurrent updates, and has lockless reads! This means that the ConcurrentDictionary gets even more efficient the higher the ratio of reads-to-writes you have. ConcurrentDictionary and Dictionary differences For the most part, the ConcurrentDictionary<TKey,TValue> behaves like it’s Dictionary<TKey,TValue> counterpart with a few differences. Some notable examples of which are: Add() does not exist in the concurrent dictionary. This means you must use TryAdd(), AddOrUpdate(), or GetOrAdd(). It also means that you can’t use a collection initializer with the concurrent dictionary. TryAdd() replaced Add() to attempt atomic, safe adds. Because Add() only succeeds if the item doesn’t already exist, we need an atomic operation to check if the item exists, and if not add it while still under an atomic lock. TryUpdate() was added to attempt atomic, safe updates. If we want to update an item, we must make sure it exists first and that the original value is what we expected it to be. If all these are true, we can update the item under one atomic step. TryRemove() was added to attempt atomic, safe removes. To safely attempt to remove a value we need to see if the key exists first, this checks for existence and removes under an atomic lock. AddOrUpdate() was added to attempt an thread-safe “upsert”. There are many times where you want to insert into a dictionary if the key doesn’t exist, or update the value if it does. This allows you to make a thread-safe add-or-update. GetOrAdd() was added to attempt an thread-safe query/insert. Sometimes, you want to query for whether an item exists in the cache, and if it doesn’t insert a starting value for it. This allows you to get the value if it exists and insert if not. Count, Keys, Values properties take a snapshot of the dictionary. Accessing these properties may interfere with add and update performance and should be used with caution. ToArray() returns a static snapshot of the dictionary. That is, the dictionary is locked, and then copied to an array as a O(n) operation. GetEnumerator() is thread-safe and efficient, but allows dirty reads. Because reads require no locking, you can safely iterate over the contents of the dictionary. The only downside is that, depending on timing, you may get dirty reads. Dirty reads during iteration The last point on GetEnumerator() bears some explanation. Picture a scenario in which you call GetEnumerator() (or iterate using a foreach, etc.) and then, during that iteration the dictionary gets updated. This may not sound like a big deal, but it can lead to inconsistent results if used incorrectly. The problem is that items you already iterated over that are updated a split second after don’t show the update, but items that you iterate over that were updated a split second before do show the update. Thus you may get a combination of items that are “stale” because you iterated before the update, and “fresh” because they were updated after GetEnumerator() but before the iteration reached them. Let’s illustrate with an example, let’s say you load up a concurrent dictionary like this: 1: // load up a dictionary. 2: var dictionary = new ConcurrentDictionary<string, int>(); 3: 4: dictionary["A"] = 1; 5: dictionary["B"] = 2; 6: dictionary["C"] = 3; 7: dictionary["D"] = 4; 8: dictionary["E"] = 5; 9: dictionary["F"] = 6; Then you have one task (using the wonderful TPL!) to iterate using dirty reads: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); And one task to attempt updates in a separate thread (probably): 1: // attempt updates in a separate thread 2: var updateTask = new Task(() => 3: { 4: // iterates, and updates the value by one 5: foreach (var pair in dictionary) 6: { 7: dictionary[pair.Key] = pair.Value + 1; 8: } 9: }); Now that we’ve done this, we can fire up both tasks and wait for them to complete: 1: // start both tasks 2: updateTask.Start(); 3: iterationTask.Start(); 4: 5: // wait for both to complete. 6: Task.WaitAll(updateTask, iterationTask); Now, if I you didn’t know about the dirty reads, you may have expected to see the iteration before the updates (such as A:1, B:2, C:3, D:4, E:5, F:6). However, because the reads are dirty, we will quite possibly get a combination of some updated, some original. My own run netted this result: 1: F:6 2: E:6 3: D:5 4: C:4 5: B:3 6: A:2 Note that, of course, iteration is not in order because ConcurrentDictionary, like Dictionary, is unordered. Also note that both E and F show the value 6. This is because the output task reached F before the update, but the updates for the rest of the items occurred before their output (probably because console output is very slow, comparatively). If we want to always guarantee that we will get a consistent snapshot to iterate over (that is, at the point we ask for it we see precisely what is in the dictionary and no subsequent updates during iteration), we should iterate over a call to ToArray() instead: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary.ToArray()) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); The atomic Try…() methods As you can imagine TryAdd() and TryRemove() have few surprises. Both first check the existence of the item to determine if it can be added or removed based on whether or not the key currently exists in the dictionary: 1: // try add attempts an add and returns false if it already exists 2: if (dictionary.TryAdd("G", 7)) 3: Console.WriteLine("G did not exist, now inserted with 7"); 4: else 5: Console.WriteLine("G already existed, insert failed."); TryRemove() also has the virtue of returning the value portion of the removed entry matching the given key: 1: // attempt to remove the value, if it exists it is removed and the original is returned 2: int removedValue; 3: if (dictionary.TryRemove("C", out removedValue)) 4: Console.WriteLine("Removed C and its value was " + removedValue); 5: else 6: Console.WriteLine("C did not exist, remove failed."); Now TryUpdate() is an interesting creature. You might think from it’s name that TryUpdate() first checks for an item’s existence, and then updates if the item exists, otherwise it returns false. Well, note quite... It turns out when you call TryUpdate() on a concurrent dictionary, you pass it not only the new value you want it to have, but also the value you expected it to have before the update. If the item exists in the dictionary, and it has the value you expected, it will update it to the new value atomically and return true. If the item is not in the dictionary or does not have the value you expected, it is not modified and false is returned. 1: // attempt to update the value, if it exists and if it has the expected original value 2: if (dictionary.TryUpdate("G", 42, 7)) 3: Console.WriteLine("G existed and was 7, now it's 42."); 4: else 5: Console.WriteLine("G either didn't exist, or wasn't 7."); The composite Add methods The ConcurrentDictionary also has composite add methods that can be used to perform updates and gets, with an add if the item is not existing at the time of the update or get. The first of these, AddOrUpdate(), allows you to add a new item to the dictionary if it doesn’t exist, or update the existing item if it does. For example, let’s say you are creating a dictionary of counts of stock ticker symbols you’ve subscribed to from a market data feed: 1: public sealed class SubscriptionManager 2: { 3: private readonly ConcurrentDictionary<string, int> _subscriptions = new ConcurrentDictionary<string, int>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public void AddSubscription(string tickerKey) 7: { 8: // add a new subscription with count of 1, or update existing count by 1 if exists 9: var resultCount = _subscriptions.AddOrUpdate(tickerKey, 1, (symbol, count) => count + 1); 10: 11: // now check the result to see if we just incremented the count, or inserted first count 12: if (resultCount == 1) 13: { 14: // subscribe to symbol... 15: } 16: } 17: } Notice the update value factory Func delegate. If the key does not exist in the dictionary, the add value is used (in this case 1 representing the first subscription for this symbol), but if the key already exists, it passes the key and current value to the update delegate which computes the new value to be stored in the dictionary. The return result of this operation is the value used (in our case: 1 if added, existing value + 1 if updated). Likewise, the GetOrAdd() allows you to attempt to retrieve a value from the dictionary, and if the value does not currently exist in the dictionary it will insert a value. This can be handy in cases where perhaps you wish to cache data, and thus you would query the cache to see if the item exists, and if it doesn’t you would put the item into the cache for the first time: 1: public sealed class PriceCache 2: { 3: private readonly ConcurrentDictionary<string, double> _cache = new ConcurrentDictionary<string, double>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public double QueryPrice(string tickerKey) 7: { 8: // check for the price in the cache, if it doesn't exist it will call the delegate to create value. 9: return _cache.GetOrAdd(tickerKey, symbol => GetCurrentPrice(symbol)); 10: } 11: 12: private double GetCurrentPrice(string tickerKey) 13: { 14: // do code to calculate actual true price. 15: } 16: } There are other variations of these two methods which vary whether a value is provided or a factory delegate, but otherwise they work much the same. Oddities with the composite Add methods The AddOrUpdate() and GetOrAdd() methods are totally thread-safe, on this you may rely, but they are not atomic. It is important to note that the methods that use delegates execute those delegates outside of the lock. This was done intentionally so that a user delegate (of which the ConcurrentDictionary has no control of course) does not take too long and lock out other threads. This is not necessarily an issue, per se, but it is something you must consider in your design. The main thing to consider is that your delegate may get called to generate an item, but that item may not be the one returned! Consider this scenario: A calls GetOrAdd and sees that the key does not currently exist, so it calls the delegate. Now thread B also calls GetOrAdd and also sees that the key does not currently exist, and for whatever reason in this race condition it’s delegate completes first and it adds its new value to the dictionary. Now A is done and goes to get the lock, and now sees that the item now exists. In this case even though it called the delegate to create the item, it will pitch it because an item arrived between the time it attempted to create one and it attempted to add it. Let’s illustrate, assume this totally contrived example program which has a dictionary of char to int. And in this dictionary we want to store a char and it’s ordinal (that is, A = 1, B = 2, etc). So for our value generator, we will simply increment the previous value in a thread-safe way (perhaps using Interlocked): 1: public static class Program 2: { 3: private static int _nextNumber = 0; 4: 5: // the holder of the char to ordinal 6: private static ConcurrentDictionary<char, int> _dictionary 7: = new ConcurrentDictionary<char, int>(); 8: 9: // get the next id value 10: public static int NextId 11: { 12: get { return Interlocked.Increment(ref _nextNumber); } 13: } Then, we add a method that will perform our insert: 1: public static void Inserter() 2: { 3: for (int i = 0; i < 26; i++) 4: { 5: _dictionary.GetOrAdd((char)('A' + i), key => NextId); 6: } 7: } Finally, we run our test by starting two tasks to do this work and get the results… 1: public static void Main() 2: { 3: // 3 tasks attempting to get/insert 4: var tasks = new List<Task> 5: { 6: new Task(Inserter), 7: new Task(Inserter) 8: }; 9: 10: tasks.ForEach(t => t.Start()); 11: Task.WaitAll(tasks.ToArray()); 12: 13: foreach (var pair in _dictionary.OrderBy(p => p.Key)) 14: { 15: Console.WriteLine(pair.Key + ":" + pair.Value); 16: } 17: } If you run this with only one task, you get the expected A:1, B:2, ..., Z:26. But running this in parallel you will get something a bit more complex. My run netted these results: 1: A:1 2: B:3 3: C:4 4: D:5 5: E:6 6: F:7 7: G:8 8: H:9 9: I:10 10: J:11 11: K:12 12: L:13 13: M:14 14: N:15 15: O:16 16: P:17 17: Q:18 18: R:19 19: S:20 20: T:21 21: U:22 22: V:23 23: W:24 24: X:25 25: Y:26 26: Z:27 Notice that B is 3? This is most likely because both threads attempted to call GetOrAdd() at roughly the same time and both saw that B did not exist, thus they both called the generator and one thread got back 2 and the other got back 3. However, only one of those threads can get the lock at a time for the actual insert, and thus the one that generated the 3 won and the 3 was inserted and the 2 got discarded. This is why on these methods your factory delegates should be careful not to have any logic that would be unsafe if the value they generate will be pitched in favor of another item generated at roughly the same time. As such, it is probably a good idea to keep those generators as stateless as possible. Summary The ConcurrentDictionary is a very efficient and thread-safe version of the Dictionary generic collection. It has all the benefits of type-safety that it’s generic collection counterpart does, and in addition is extremely efficient especially when there are more reads than writes concurrently. Tweet Technorati Tags: C#, .NET, Concurrent Collections, Collections, Little Wonders, Black Rabbit Coder,James Michael Hare

Read the article
C#/.NET Little Wonders: Constraining Generics with Where Clause

- by James Michael Hare

Back when I was primarily a C++ developer, I loved C++ templates. The power of writing very reusable generic classes brought the art of programming to a brand new level. Unfortunately, when .NET 1.0 came about, they didn’t have a template equivalent. With .NET 2.0 however, we finally got generics, which once again let us spread our wings and program more generically in the world of .NET However, C# generics behave in some ways very differently from their C++ template cousins. There is a handy clause, however, that helps you navigate these waters to make your generics more powerful. The Problem – C# Assumes Lowest Common Denominator In C++, you can create a template and do nearly anything syntactically possible on the template parameter, and C++ will not check if the method/fields/operations invoked are valid until you declare a realization of the type. Let me illustrate with a C++ example: 1: // compiles fine, C++ makes no assumptions as to T 2: template <typename T> 3: class ReverseComparer 4: { 5: public: 6: int Compare(const T& lhs, const T& rhs) 7: { 8: return rhs.CompareTo(lhs); 9: } 10: }; Notice that we are invoking a method CompareTo() off of template type T. Because we don’t know at this point what type T is, C++ makes no assumptions and there are no errors. C++ tends to take the path of not checking the template type usage until the method is actually invoked with a specific type, which differs from the behavior of C#: 1: // this will NOT compile! C# assumes lowest common denominator. 2: public class ReverseComparer<T> 3: { 4: public int Compare(T lhs, T rhs) 5: { 6: return lhs.CompareTo(rhs); 7: } 8: } So why does C# give us a compiler error even when we don’t yet know what type T is? This is because C# took a different path in how they made generics. Unless you specify otherwise, for the purposes of the code inside the generic method, T is basically treated like an object (notice I didn’t say T is an object). That means that any operations, fields, methods, properties, etc that you attempt to use of type T must be available at the lowest common denominator type: object. Now, while object has the broadest applicability, it also has the fewest specific. So how do we allow our generic type placeholder to do things more than just what object can do? Solution: Constraint the Type With Where Clause So how do we get around this in C#? The answer is to constrain the generic type placeholder with the where clause. Basically, the where clause allows you to specify additional constraints on what the actual type used to fill the generic type placeholder must support. You might think that narrowing the scope of a generic means a weaker generic. In reality, though it limits the number of types that can be used with the generic, it also gives the generic more power to deal with those types. In effect these constraints says that if the type meets the given constraint, you can perform the activities that pertain to that constraint with the generic placeholders. Constraining Generic Type to Interface or Superclass One of the handiest where clause constraints is the ability to specify the type generic type must implement a certain interface or be inherited from a certain base class. For example, you can’t call CompareTo() in our first C# generic without constraints, but if we constrain T to IComparable<T>, we can: 1: public class ReverseComparer<T> 2: where T : IComparable<T> 3: { 4: public int Compare(T lhs, T rhs) 5: { 6: return lhs.CompareTo(rhs); 7: } 8: } Now that we’ve constrained T to an implementation of IComparable<T>, this means that our variables of generic type T may now call any members specified in IComparable<T> as well. This means that the call to CompareTo() is now legal. If you constrain your type, also, you will get compiler warnings if you attempt to use a type that doesn’t meet the constraint. This is much better than the syntax error you would get within C++ template code itself when you used a type not supported by a C++ template. Constraining Generic Type to Only Reference Types Sometimes, you want to assign an instance of a generic type to null, but you can’t do this without constraints, because you have no guarantee that the type used to realize the generic is not a value type, where null is meaningless. Well, we can fix this by specifying the class constraint in the where clause. By declaring that a generic type must be a class, we are saying that it is a reference type, and this allows us to assign null to instances of that type: 1: public static class ObjectExtensions 2: { 3: public static TOut Maybe<TIn, TOut>(this TIn value, Func<TIn, TOut> accessor) 4: where TOut : class 5: where TIn : class 6: { 7: return (value != null) ? accessor(value) : null; 8: } 9: } In the example above, we want to be able to access a property off of a reference, and if that reference is null, pass the null on down the line. To do this, both the input type and the output type must be reference types (yes, nullable value types could also be considered applicable at a logical level, but there’s not a direct constraint for those). Constraining Generic Type to only Value Types Similarly to constraining a generic type to be a reference type, you can also constrain a generic type to be a value type. To do this you use the struct constraint which specifies that the generic type must be a value type (primitive, struct, enum, etc). Consider the following method, that will convert anything that is IConvertible (int, double, string, etc) to the value type you specify, or null if the instance is null. 1: public static T? ConvertToNullable<T>(IConvertible value) 2: where T : struct 3: { 4: T? result = null; 5: 6: if (value != null) 7: { 8: result = (T)Convert.ChangeType(value, typeof(T)); 9: } 10: 11: return result; 12: } Because T was constrained to be a value type, we can use T? (System.Nullable<T>) where we could not do this if T was a reference type. Constraining Generic Type to Require Default Constructor You can also constrain a type to require existence of a default constructor. Because by default C# doesn’t know what constructors a generic type placeholder does or does not have available, it can’t typically allow you to call one. That said, if you give it the new() constraint, it will mean that the type used to realize the generic type must have a default (no argument) constructor. Let’s assume you have a generic adapter class that, given some mappings, will adapt an item from type TFrom to type TTo. Because it must create a new instance of type TTo in the process, we need to specify that TTo has a default constructor: 1: // Given a set of Action<TFrom,TTo> mappings will map TFrom to TTo 2: public class Adapter<TFrom, TTo> : IEnumerable<Action<TFrom, TTo>> 3: where TTo : class, new() 4: { 5: // The list of translations from TFrom to TTo 6: public List<Action<TFrom, TTo>> Translations { get; private set; } 7: 8: // Construct with empty translation and reverse translation sets. 9: public Adapter() 10: { 11: // did this instead of auto-properties to allow simple use of initializers 12: Translations = new List<Action<TFrom, TTo>>(); 13: } 14: 15: // Add a translator to the collection, useful for initializer list 16: public void Add(Action<TFrom, TTo> translation) 17: { 18: Translations.Add(translation); 19: } 20: 21: // Add a translator that first checks a predicate to determine if the translation 22: // should be performed, then translates if the predicate returns true 23: public void Add(Predicate<TFrom> conditional, Action<TFrom, TTo> translation) 24: { 25: Translations.Add((from, to) => 26: { 27: if (conditional(from)) 28: { 29: translation(from, to); 30: } 31: }); 32: } 33: 34: // Translates an object forward from TFrom object to TTo object. 35: public TTo Adapt(TFrom sourceObject) 36: { 37: var resultObject = new TTo(); 38: 39: // Process each translation 40: Translations.ForEach(t => t(sourceObject, resultObject)); 41: 42: return resultObject; 43: } 44: 45: // Returns an enumerator that iterates through the collection. 46: public IEnumerator<Action<TFrom, TTo>> GetEnumerator() 47: { 48: return Translations.GetEnumerator(); 49: } 50: 51: // Returns an enumerator that iterates through a collection. 52: IEnumerator IEnumerable.GetEnumerator() 53: { 54: return GetEnumerator(); 55: } 56: } Notice, however, you can’t specify any other constructor, you can only specify that the type has a default (no argument) constructor. Summary The where clause is an excellent tool that gives your .NET generics even more power to perform tasks higher than just the base "object level" behavior. There are a few things you cannot specify with constraints (currently) though: Cannot specify the generic type must be an enum. Cannot specify the generic type must have a certain property or method without specifying a base class or interface – that is, you can’t say that the generic must have a Start() method. Cannot specify that the generic type allows arithmetic operations. Cannot specify that the generic type requires a specific non-default constructor. In addition, you cannot overload a template definition with different, opposing constraints. For example you can’t define a Adapter<T> where T : struct and Adapter<T> where T : class. Hopefully, in the future we will get some of these things to make the where clause even more useful, but until then what we have is extremely valuable in making our generics more user friendly and more powerful! Technorati Tags: C#,.NET,Little Wonders,BlackRabbitCoder,where,generics

Read the article
C#/.NET Little Wonders: Skip() and Take()

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. I’ve covered many valuable methods from System.Linq class library before, so you already know it’s packed with extension-method goodness. Today I’d like to cover two small families I’ve neglected to mention before: Skip() and Take(). While these methods seem so simple, they are an easy way to create sub-sequences for IEnumerable<T>, much the way GetRange() creates sub-lists for List<T>. Skip() and SkipWhile() The Skip() family of methods is used to ignore items in a sequence until either a certain number are passed, or until a certain condition becomes false. This makes the methods great for starting a sequence at a point possibly other than the first item of the original sequence. The Skip() family of methods contains the following methods (shown below in extension method syntax): Skip(int count) Ignores the specified number of items and returns a sequence starting at the item after the last skipped item (if any). SkipWhile(Func<T, bool> predicate) Ignores items as long as the predicate returns true and returns a sequence starting with the first item to invalidate the predicate (if any). SkipWhile(Func<T, int, bool> predicate) Same as above, but passes not only the item itself to the predicate, but also the index of the item. For example: 1: var list = new[] { 3.14, 2.72, 42.0, 9.9, 13.0, 101.0 }; 2: 3: // sequence contains { 2.72, 42.0, 9.9, 13.0, 101.0 } 4: var afterSecond = list.Skip(1); 5: Console.WriteLine(string.Join(", ", afterSecond)); 6: 7: // sequence contains { 42.0, 9.9, 13.0, 101.0 } 8: var afterFirstDoubleDigit = list.SkipWhile(v => v < 10.0); 9: Console.WriteLine(string.Join(", ", afterFirstDoubleDigit)); Note that the SkipWhile() stops skipping at the first item that returns false and returns from there to the rest of the sequence, even if further items in that sequence also would satisfy the predicate (otherwise, you’d probably be using Where() instead, of course). If you do use the form of SkipWhile() which also passes an index into the predicate, then you should keep in mind that this is the index of the item in the sequence you are calling SkipWhile() from, not the index in the original collection. That is, consider the following: 1: var list = new[] { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2: 3: // Get all items < 10, then 4: var whatAmI = list 5: .Skip(2) 6: .SkipWhile((i, x) => i > x); For this example the result above is 2.4, and not 1.2, 2.2, 2.3, 2.4 as some might expect. The key is knowing what the index is that’s passed to the predicate in SkipWhile(). In the code above, because Skip(2) skips 1.0 and 1.1, the sequence passed to SkipWhile() begins at 1.2 and thus it considers the “index” of 1.2 to be 0 and not 2. This same logic applies when using any of the extension methods that have an overload that allows you to pass an index into the delegate, such as SkipWhile(), TakeWhile(), Select(), Where(), etc. It should also be noted, that it’s fine to Skip() more items than exist in the sequence (an empty sequence is the result), or even to Skip(0) which results in the full sequence. So why would it ever be useful to return Skip(0) deliberately? One reason might be to return a List<T> as an immutable sequence. Consider this class: 1: public class MyClass 2: { 3: private List<int> _myList = new List<int>(); 4: 5: // works on surface, but one can cast back to List<int> and mutate the original... 6: public IEnumerable<int> OneWay 7: { 8: get { return _myList; } 9: } 10: 11: // works, but still has Add() etc which throw at runtime if accidentally called 12: public ReadOnlyCollection<int> AnotherWay 13: { 14: get { return new ReadOnlyCollection<int>(_myList); } 15: } 16: 17: // immutable, can't be cast back to List<int>, doesn't have methods that throw at runtime 18: public IEnumerable<int> YetAnotherWay 19: { 20: get { return _myList.Skip(0); } 21: } 22: } This code snippet shows three (among many) ways to return an internal sequence in varying levels of immutability. Obviously if you just try to return as IEnumerable<T> without doing anything more, there’s always the danger the caller could cast back to List<T> and mutate your internal structure. You could also return a ReadOnlyCollection<T>, but this still has the mutating methods, they just throw at runtime when called instead of giving compiler errors. Finally, you can return the internal list as a sequence using Skip(0) which skips no items and just runs an iterator through the list. The result is an iterator, which cannot be cast back to List<T>. Of course, there’s many ways to do this (including just cloning the list, etc.) but the point is it illustrates a potential use of using an explicit Skip(0). Take() and TakeWhile() The Take() and TakeWhile() methods can be though of as somewhat of the inverse of Skip() and SkipWhile(). That is, while Skip() ignores the first X items and returns the rest, Take() returns a sequence of the first X items and ignores the rest. Since they are somewhat of an inverse of each other, it makes sense that their calling signatures are identical (beyond the method name obviously): Take(int count) Returns a sequence containing up to the specified number of items. Anything after the count is ignored. TakeWhile(Func<T, bool> predicate) Returns a sequence containing items as long as the predicate returns true. Anything from the point the predicate returns false and beyond is ignored. TakeWhile(Func<T, int, bool> predicate) Same as above, but passes not only the item itself to the predicate, but also the index of the item. So, for example, we could do the following: 1: var list = new[] { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2: 3: // sequence contains 1.0 and 1.1 4: var firstTwo = list.Take(2); 5: 6: // sequence contains 1.0, 1.1, 1.2 7: var underTwo = list.TakeWhile(i => i < 2.0); The same considerations for SkipWhile() with index apply to TakeWhile() with index, of course. Using Skip() and Take() for sub-sequences A few weeks back, I talked about The List<T> Range Methods and showed how they could be used to get a sub-list of a List<T>. This works well if you’re dealing with List<T>, or don’t mind converting to List<T>. But if you have a simple IEnumerable<T> sequence and want to get a sub-sequence, you can also use Skip() and Take() to much the same effect: 1: var list = new List<double> { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2: 3: // results in List<T> containing { 1.2, 2.2, 2.3 } 4: var subList = list.GetRange(2, 3); 5: 6: // results in sequence containing { 1.2, 2.2, 2.3 } 7: var subSequence = list.Skip(2).Take(3); I say “much the same effect” because there are some differences. First of all GetRange() will throw if the starting index or the count are greater than the number of items in the list, but Skip() and Take() do not. Also GetRange() is a method off of List<T>, thus it can use direct indexing to get to the items much more efficiently, whereas Skip() and Take() operate on sequences and may actually have to walk through the items they skip to create the resulting sequence. So each has their pros and cons. My general rule of thumb is if I’m already working with a List<T> I’ll use GetRange(), but for any plain IEnumerable<T> sequence I’ll tend to prefer Skip() and Take() instead. Summary The Skip() and Take() families of LINQ extension methods are handy for producing sub-sequences from any IEnumerable<T> sequence. Skip() will ignore the specified number of items and return the rest of the sequence, whereas Take() will return the specified number of items and ignore the rest of the sequence. Similarly, the SkipWhile() and TakeWhile() methods can be used to skip or take items, respectively, until a given predicate returns false. Technorati Tags: C#, CSharp, .NET, LINQ, IEnumerable<T>, Skip, Take, SkipWhile, TakeWhile

Read the article
C#/.NET Little Wonders: The Useful But Overlooked Sets

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. Today we will be looking at two set implementations in the System.Collections.Generic namespace: HashSet<T> and SortedSet<T>. Even though most people think of sets as mathematical constructs, they are actually very useful classes that can be used to help make your application more performant if used appropriately. A Background From Math In mathematical terms, a set is an unordered collection of unique items. In other words, the set {2,3,5} is identical to the set {3,5,2}. In addition, the set {2, 2, 4, 1} would be invalid because it would have a duplicate item (2). In addition, you can perform set arithmetic on sets such as: Intersections: The intersection of two sets is the collection of elements common to both. Example: The intersection of {1,2,5} and {2,4,9} is the set {2}. Unions: The union of two sets is the collection of unique items present in either or both set. Example: The union of {1,2,5} and {2,4,9} is {1,2,4,5,9}. Differences: The difference of two sets is the removal of all items from the first set that are common between the sets. Example: The difference of {1,2,5} and {2,4,9} is {1,5}. Supersets: One set is a superset of a second set if it contains all elements that are in the second set. Example: The set {1,2,5} is a superset of {1,5}. Subsets: One set is a subset of a second set if all the elements of that set are contained in the first set. Example: The set {1,5} is a subset of {1,2,5}. If We’re Not Doing Math, Why Do We Care? Now, you may be thinking: why bother with the set classes in C# if you have no need for mathematical set manipulation? The answer is simple: they are extremely efficient ways to determine ownership in a collection. For example, let’s say you are designing an order system that tracks the price of a particular equity, and once it reaches a certain point will trigger an order. Now, since there’s tens of thousands of equities on the markets, you don’t want to track market data for every ticker as that would be a waste of time and processing power for symbols you don’t have orders for. Thus, we just want to subscribe to the stock symbol for an equity order only if it is a symbol we are not already subscribed to. Every time a new order comes in, we will check the list of subscriptions to see if the new order’s stock symbol is in that list. If it is, great, we already have that market data feed! If not, then and only then should we subscribe to the feed for that symbol. So far so good, we have a collection of symbols and we want to see if a symbol is present in that collection and if not, add it. This really is the essence of set processing, but for the sake of comparison, let’s say you do a list instead: 1: // class that handles are order processing service 2: public sealed class OrderProcessor 3: { 4: // contains list of all symbols we are currently subscribed to 5: private readonly List<string> _subscriptions = new List<string>(); 6: 7: ... 8: } Now whenever you are adding a new order, it would look something like: 1: public PlaceOrderResponse PlaceOrder(Order newOrder) 2: { 3: // do some validation, of course... 4: 5: // check to see if already subscribed, if not add a subscription 6: if (!_subscriptions.Contains(newOrder.Symbol)) 7: { 8: // add the symbol to the list 9: _subscriptions.Add(newOrder.Symbol); 10: 11: // do whatever magic is needed to start a subscription for the symbol 12: } 13: 14: // place the order logic! 15: } What’s wrong with this? In short: performance! Finding an item inside a List<T> is a linear - O(n) – operation, which is not a very performant way to find if an item exists in a collection. (I used to teach algorithms and data structures in my spare time at a local university, and when you began talking about big-O notation you could immediately begin to see eyes glossing over as if it was pure, useless theory that would not apply in the real world, but I did and still do believe it is something worth understanding well to make the best choices in computer science). Let’s think about this: a linear operation means that as the number of items increases, the time that it takes to perform the operation tends to increase in a linear fashion. Put crudely, this means if you double the collection size, you might expect the operation to take something like the order of twice as long. Linear operations tend to be bad for performance because they mean that to perform some operation on a collection, you must potentially “visit” every item in the collection. Consider finding an item in a List<T>: if you want to see if the list has an item, you must potentially check every item in the list before you find it or determine it’s not found. Now, we could of course sort our list and then perform a binary search on it, but sorting is typically a linear-logarithmic complexity – O(n * log n) - and could involve temporary storage. So performing a sort after each add would probably add more time. As an alternative, we could use a SortedList<TKey, TValue> which sorts the list on every Add(), but this has a similar level of complexity to move the items and also requires a key and value, and in our case the key is the value. This is why sets tend to be the best choice for this type of processing: they don’t rely on separate keys and values for ordering – so they save space – and they typically don’t care about ordering – so they tend to be extremely performant. The .NET BCL (Base Class Library) has had the HashSet<T> since .NET 3.5, but at that time it did not implement the ISet<T> interface. As of .NET 4.0, HashSet<T> implements ISet<T> and a new set, the SortedSet<T> was added that gives you a set with ordering. HashSet<T> – For Unordered Storage of Sets When used right, HashSet<T> is a beautiful collection, you can think of it as a simplified Dictionary<T,T>. That is, a Dictionary where the TKey and TValue refer to the same object. This is really an oversimplification, but logically it makes sense. I’ve actually seen people code a Dictionary<T,T> where they store the same thing in the key and the value, and that’s just inefficient because of the extra storage to hold both the key and the value. As it’s name implies, the HashSet<T> uses a hashing algorithm to find the items in the set, which means it does take up some additional space, but it has lightning fast lookups! Compare the times below between HashSet<T> and List<T>: Operation HashSet<T> List<T> Add() O(1) O(1) at end O(n) in middle Remove() O(1) O(n) Contains() O(1) O(n) Now, these times are amortized and represent the typical case. In the very worst case, the operations could be linear if they involve a resizing of the collection – but this is true for both the List and HashSet so that’s a less of an issue when comparing the two. The key thing to note is that in the general case, HashSet is constant time for adds, removes, and contains! This means that no matter how large the collection is, it takes roughly the exact same amount of time to find an item or determine if it’s not in the collection. Compare this to the List where almost any add or remove must rearrange potentially all the elements! And to find an item in the list (if unsorted) you must search every item in the List. So as you can see, if you want to create an unordered collection and have very fast lookup and manipulation, the HashSet is a great collection. And since HashSet<T> implements ICollection<T> and IEnumerable<T>, it supports nearly all the same basic operations as the List<T> and can use the System.Linq extension methods as well. All we have to do to switch from a List<T> to a HashSet<T> is change our declaration. Since List and HashSet support many of the same members, chances are we won’t need to change much else. 1: public sealed class OrderProcessor 2: { 3: private readonly HashSet<string> _subscriptions = new HashSet<string>(); 4: 5: // ... 6: 7: public PlaceOrderResponse PlaceOrder(Order newOrder) 8: { 9: // do some validation, of course... 10: 11: // check to see if already subscribed, if not add a subscription 12: if (!_subscriptions.Contains(newOrder.Symbol)) 13: { 14: // add the symbol to the list 15: _subscriptions.Add(newOrder.Symbol); 16: 17: // do whatever magic is needed to start a subscription for the symbol 18: } 19: 20: // place the order logic! 21: } 22: 23: // ... 24: } 25: Notice, we didn’t change any code other than the declaration for _subscriptions to be a HashSet<T>. Thus, we can pick up the performance improvements in this case with minimal code changes. SortedSet<T> – Ordered Storage of Sets Just like HashSet<T> is logically similar to Dictionary<T,T>, the SortedSet<T> is logically similar to the SortedDictionary<T,T>. The SortedSet can be used when you want to do set operations on a collection, but you want to maintain that collection in sorted order. Now, this is not necessarily mathematically relevant, but if your collection needs do include order, this is the set to use. So the SortedSet seems to be implemented as a binary tree (possibly a red-black tree) internally. Since binary trees are dynamic structures and non-contiguous (unlike List and SortedList) this means that inserts and deletes do not involve rearranging elements, or changing the linking of the nodes. There is some overhead in keeping the nodes in order, but it is much smaller than a contiguous storage collection like a List<T>. Let’s compare the three: Operation HashSet<T> SortedSet<T> List<T> Add() O(1) O(log n) O(1) at end O(n) in middle Remove() O(1) O(log n) O(n) Contains() O(1) O(log n) O(n) The MSDN documentation seems to indicate that operations on SortedSet are O(1), but this seems to be inconsistent with its implementation and seems to be a documentation error. There’s actually a separate MSDN document (here) on SortedSet that indicates that it is, in fact, logarithmic in complexity. Let’s put it in layman’s terms: logarithmic means you can double the collection size and typically you only add a single extra “visit” to an item in the collection. Take that in contrast to List<T>’s linear operation where if you double the size of the collection you double the “visits” to items in the collection. This is very good performance! It’s still not as performant as HashSet<T> where it always just visits one item (amortized), but for the addition of sorting this is a good thing. Consider the following table, now this is just illustrative data of the relative complexities, but it’s enough to get the point: Collection Size O(1) Visits O(log n) Visits O(n) Visits 1 1 1 1 10 1 4 10 100 1 7 100 1000 1 10 1000 Notice that the logarithmic – O(log n) – visit count goes up very slowly compare to the linear – O(n) – visit count. This is because since the list is sorted, it can do one check in the middle of the list, determine which half of the collection the data is in, and discard the other half (binary search). So, if you need your set to be sorted, you can use the SortedSet<T> just like the HashSet<T> and gain sorting for a small performance hit, but it’s still faster than a List<T>. Unique Set Operations Now, if you do want to perform more set-like operations, both implementations of ISet<T> support the following, which play back towards the mathematical set operations described before: IntersectWith() – Performs the set intersection of two sets. Modifies the current set so that it only contains elements also in the second set. UnionWith() – Performs a set union of two sets. Modifies the current set so it contains all elements present both in the current set and the second set. ExceptWith() – Performs a set difference of two sets. Modifies the current set so that it removes all elements present in the second set. IsSupersetOf() – Checks if the current set is a superset of the second set. IsSubsetOf() – Checks if the current set is a subset of the second set. For more information on the set operations themselves, see the MSDN description of ISet<T> (here). What Sets Don’t Do Don’t get me wrong, sets are not silver bullets. You don’t really want to use a set when you want separate key to value lookups, that’s what the IDictionary implementations are best for. Also sets don’t store temporal add-order. That is, if you are adding items to the end of a list all the time, your list is ordered in terms of when items were added to it. This is something the sets don’t do naturally (though you could use a SortedSet with an IComparer with a DateTime but that’s overkill) but List<T> can. Also, List<T> allows indexing which is a blazingly fast way to iterate through items in the collection. Iterating over all the items in a List<T> is generally much, much faster than iterating over a set. Summary Sets are an excellent tool for maintaining a lookup table where the item is both the key and the value. In addition, if you have need for the mathematical set operations, the C# sets support those as well. The HashSet<T> is the set of choice if you want the fastest possible lookups but don’t care about order. In contrast the SortedSet<T> will give you a sorted collection at a slight reduction in performance. Technorati Tags: C#,.Net,Little Wonders,BlackRabbitCoder,ISet,HashSet,SortedSet

Read the article
What's wrong with my Objective-C class?

- by zgillis

I am having trouble with my Objective-C code. I am trying to print out all of the details of my object created from my "Person" class, but the first and last names are not coming through in the NSLog method. They are replaced by spaces. Person.h: http://pastebin.com/mzWurkUL Person.m: http://pastebin.com/JNSi39aw This is my main source file: #import <Foundation/Foundation.h> #import "Person.h" int main (int argc, const char * argv[]) { Person *bobby = [[Person alloc] init]; [bobby setFirstName:@"Bobby"]; [bobby setLastName:@"Flay"]; [bobby setAge:34]; [bobby setWeight:169]; NSLog(@"%s %s is %d years old and weighs %d pounds.", [bobby first_name], [bobby last_name], [bobby age], [bobby weight]); return 0; }

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >