This is the third of a three part series that deals with the issue of faking test data in the context of a legacy app that was built with Microsoft's Entity Framework (EF) on top of an MS SQL Server database – a scenario that can be found very often. Please read the first part for a description of the sample application, a discussion of some general aspects of unit testing in a database context, and of some more specific aspects of the here discussed EF/MSSQL combination.
Lately, I wondered how you would ‘mock’ the data layer of a legacy application, when this data layer is made up of an MS Entity Framework (EF) model in combination with a MS SQL Server database. Originally, this question came up in the context of how you could enable higher-level integration tests (automated UI tests, to be exact) for a legacy application that uses this EF/MSSQL combo as its data store mechanism – a not so uncommon scenario.
The question sparked my interest, and I decided to dive into it somewhat deeper. What I've found out is, in short, that it's not very easy and straightforward to do it – but it can be done. The two strategies that are best suited to fit the bill involve using either the (commercial) Typemock Isolator tool or the (free) NDbUnit framework. The use of Typemock was discussed in the previous post, this post now will present the NDbUnit approach...
NDbUnit is an Apache 2.0-licensed open-source project, and like so many other Nxxx tools and frameworks, it is basically a C#/.NET port of the corresponding Java version (DbUnit namely). In short, it helps you in flexibly managing the state of a database in that it lets you easily perform basic operations (like e.g. Insert, Delete, Refresh, DeleteAll)  against your database and, most notably, lets you feed it with data from external xml files.
Let's have a look at how things can be done with the help of this framework.
Preparing the test data
Compared to Typemock, using NDbUnit implies a totally different approach to meet our testing needs.  So the here described testing scenario requires an instance of an SQL Server database in operation, and it also means that the Entity Framework model that sits on top of this database is completely unaffected.
First things first: For its interactions with the database, NDbUnit relies on a .NET Dataset xsd file. See Step 1 of their Quick Start Guide for a description of how to create one. With this prerequisite in place then, the test fixture's setup code could look something like this:
[TestFixture, TestsOn(typeof(PersonRepository))]     
[Metadata("NDbUnit Quickstart URL",     
          "http://code.google.com/p/ndbunit/wiki/QuickStartGuide")]     
[Description("Uses the NDbUnit library to provide test data to a local database.")]     
public class PersonRepositoryFixture     
{     
    #region Constants     
    private const string XmlSchema = @"..\..\TestData\School.xsd";     
    #endregion // Constants     
    #region Fields     
    private SchoolEntities _schoolContext;     
    private PersonRepository _personRepository;     
    private INDbUnitTest _database;     
    #endregion // Fields     
    #region Setup/TearDown     
    [FixtureSetUp]     
    public void FixtureSetUp()     
    {     
        var connectionString = ConfigurationManager.ConnectionStrings["School_Test"].ConnectionString;     
        _database = new SqlDbUnitTest(connectionString);     
        _database.ReadXmlSchema(XmlSchema);     
        var entityConnectionStringBuilder = new EntityConnectionStringBuilder     
        {     
            Metadata = "res://*/School.csdl|res://*/School.ssdl|res://*/School.msl",     
            Provider = "System.Data.SqlClient",     
            ProviderConnectionString = connectionString     
        };     
        _schoolContext = new SchoolEntities(entityConnectionStringBuilder.ConnectionString);     
        _personRepository = new PersonRepository(this._schoolContext);     
    }     
    [FixtureTearDown]     
    public void FixtureTearDown()     
    {     
        _database.PerformDbOperation(DbOperationFlag.DeleteAll);     
        _schoolContext.Dispose();     
    }     
    ...
 As you can see, there is slightly more fixture setup code involved if your tests are using NDbUnit to provide the test data: Because we're dealing with a physical database instance here, we first need to pick up the test-specific connection string from the test assemblies' App.config, then initialize an NDbUnit helper object with this connection along with the provided xsd file, and also set up the SchoolEntities and the PersonRepository instances accordingly.
The _database field (an instance of the INdUnitTest interface) will be our single access point to the underlying database: We use it to perform all the required operations against the data store. To have a flexible mechanism to easily insert data into the database, we can write a helper method like this:
private void InsertTestData(params string[] dataFileNames)     
{     
    _database.PerformDbOperation(DbOperationFlag.DeleteAll);     
    if (dataFileNames == null)     
    {     
        return;     
    }     
    try     
    {     
        foreach (string fileName in dataFileNames)     
        {     
            if (!File.Exists(fileName))     
            {     
                throw new FileNotFoundException(Path.GetFullPath(fileName));     
            }     
            _database.ReadXml(fileName);     
            _database.PerformDbOperation(DbOperationFlag.InsertIdentity);     
        }     
    }     
    catch     
    {     
        _database.PerformDbOperation(DbOperationFlag.DeleteAll);     
        throw;     
    }     
}
This lets us easily insert test data from xml files, in any number and in a  controlled order (which is important because we eventually must fulfill referential constraints, or we must account for some other stuff that imposes a specific ordering on data insertion). Again, as with Typemock, I won't go into API details here. - Unfortunately, there isn't too much documentation for NDbUnit anyway, other than the already mentioned Quick Start Guide (and the source code itself, of course) - a not so uncommon problem with smaller Open Source Projects.
Last not least, we need to provide the required test data in xml form. A snippet for data from the People table might look like this, for example:
<?xml version="1.0" encoding="utf-8" ?>     
<School xmlns="http://tempuri.org/School.xsd">     
  <Person>     
    <PersonID>1</PersonID>     
    <LastName>Abercrombie</LastName>     
    <FirstName>Kim</FirstName>     
    <HireDate>1995-03-11T00:00:00</HireDate>     
  </Person>     
  <Person>     
    <PersonID>2</PersonID>     
    <LastName>Barzdukas</LastName>     
    <FirstName>Gytis</FirstName>     
    <EnrollmentDate>2005-09-01T00:00:00</EnrollmentDate>     
  </Person>     
  <Person>     
    ...
You can also have data from various tables in one single xml file, if that's appropriate for you (but beware of the already mentioned ordering issues). It's true that your test assembly may end up with dozens of such xml files, each containing quite a big amount of text data. But because the files are of very low complexity, and with the help of a little bit of Copy/Paste and Excel magic, this appears to be well manageable.
Executing some basic tests
Here are some of the possible tests that can be written with the above preparations in place:
private const string People = @"..\..\TestData\School.People.xml";     
...     
[Test, MultipleAsserts, TestsOn("PersonRepository.GetNameList")]     
public void GetNameList_ListOrdering_ReturnsTheExpectedFullNames()     
{     
    InsertTestData(People);     
    List<string> names =     
        _personRepository.GetNameList(NameOrdering.List);     
    Assert.Count(34, names);     
    Assert.AreEqual("Abercrombie, Kim", names.First());     
    Assert.AreEqual("Zheng, Roger", names.Last());     
}     
[Test, MultipleAsserts, TestsOn("PersonRepository.GetNameList")]     
[DependsOn("RemovePerson_CalledOnce_DecreasesCountByOne")]     
public void GetNameList_NormalOrdering_ReturnsTheExpectedFullNames()     
{     
    InsertTestData(People);     
    List<string> names =     
        _personRepository.GetNameList(NameOrdering.Normal);     
    Assert.Count(34, names);     
    Assert.AreEqual("Alexandra Walker", names.First());     
    Assert.AreEqual("Yan Li", names.Last());     
}     
[Test, TestsOn("PersonRepository.AddPerson")]     
public void AddPerson_CalledOnce_IncreasesCountByOne()     
{     
    InsertTestData(People);     
    int count = _personRepository.Count;     
    _personRepository.AddPerson(new Person { FirstName = "Thomas", LastName = "Weller" });     
    Assert.AreEqual(count + 1, _personRepository.Count);     
}     
[Test, TestsOn("PersonRepository.RemovePerson")]     
public void RemovePerson_CalledOnce_DecreasesCountByOne()     
{     
    InsertTestData(People);     
    int count = _personRepository.Count;     
    _personRepository.RemovePerson(new Person { PersonID = 33 });     
    Assert.AreEqual(count - 1, _personRepository.Count);     
}
Not much difference here compared to the corresponding Typemock versions, except that we had to do a bit more preparational work (and also it was harder to get the required knowledge). But this picture changes quite dramatically if we look at some more demanding test cases:
Ok, and what if things are becoming somewhat more complex?
Tests like the above ones represent the 'easy' scenarios. They may account for the biggest portion of real-world use cases of the application, and they are important to make sure that it is generally sound. But usually, all these nasty little bugs originate from the more complex parts of our code, or they occur when something goes wrong.
So, for a testing strategy to be of real practical use, it is especially important to see how easy or difficult it is to mimick a scenario which represents a more complex or exceptional case. The following test, for example, deals with the case that there is some sort of invalid input from the caller:
[Test, MultipleAsserts, TestsOn("PersonRepository.GetCourseMembers")]     
[Row(null, typeof(ArgumentNullException))]     
[Row("", typeof(ArgumentException))]     
[Row("NotExistingCourse", typeof(ArgumentException))]     
public void GetCourseMembers_WithGivenVariousInvalidValues_Throws(string courseTitle, Type expectedInnerExceptionType)     
{     
    var exception = Assert.Throws<RepositoryException>(() =>     
                                _personRepository.GetCourseMembers(courseTitle));     
    Assert.IsInstanceOfType(expectedInnerExceptionType, exception.InnerException);     
}
Apparently, this test doesn't need an 'Arrange' part at all (see here for the same test with the Typemock tool). It acts just like any other client code, and all the required business logic comes from the database itself. This doesn't always necessarily mean that there is less complexity, but only that the complexity happens in a different part of your test resources (in the xml files namely, where you sometimes have to spend a lot of effort for carefully preparing the required test data). 
Another example, which relies on an underlying 1-n relationship, might be this:
[Test, MultipleAsserts, TestsOn("PersonRepository.GetCourseMembers")]     
public void GetCourseMembers_WhenGivenAnExistingCourse_ReturnsListOfStudents()     
{     
    InsertTestData(People, Course, Department, StudentGrade);     
    List<Person> persons = _personRepository.GetCourseMembers("Macroeconomics");     
    Assert.Count(4, persons);     
    Assert.ForAll(     
        persons,     
        @p => new[] { 10, 11, 12, 14 }.Contains(@p.PersonID),     
        "Person has none of the expected IDs.");     
}
If you compare this test to its corresponding Typemock version, you immediately see that the test itself is much simpler, easier to read, and thus much more intention-revealing. The complexity here lies hidden behind the call to the InsertTestData() helper method and the content of the used xml files with the test data. And also note that you might have to provide additional data which are not even directly relevant to your test, but are required only to fulfill some integrity needs of the underlying database.
Conclusion
The first thing to notice when comparing the NDbUnit approach to its Typemock counterpart obviously deals with performance: Of course, NDbUnit is much slower than Typemock. Technically,  it doesn't even make sense to compare the two tools. But practically, it may well play a role and could or could not be an issue, depending on how much tests you have of this kind, how often you run them, and what role they play in your development cycle.
Also, because the dataset from the required xsd file must fully match the database schema (even in parts that otherwise wouldn't be relevant to you), it can be quite cumbersome to be in a team where different people are working with the database in parallel.
My personal experience is – as already said in the first part – that Typemock gives you a better development experience in a 'dynamic' scenario (when you're working in some kind of TDD-style, you're oftentimes executing the tests from your dev box, and your database schema changes frequently), whereas the NDbUnit approach is a good and solid solution in more 'static' development scenarios (when you need to execute the tests less frequently or only on a separate build server, and/or the underlying database schema can be kept relatively stable), for example some variations of higher-level integration or User-Acceptance tests.
But in any case, opening Entity Framework based applications for testing requires a fair amount of resources, planning, and preparational work – it's definitely not the kind of stuff that you would call 'easy to test'. Hopefully, future versions of EF will take testing concerns into account. Otherwise, I don't see too much of a future for the framework in the long run, even though it's quite popular at the moment...
The sample solution
A sample solution (VS 2010) with the code from this article series is available via my Bitbucket account from here (Bitbucket is a hosting site for Mercurial repositories. The repositories may also be accessed with the Git and Subversion SCMs - consult the documentation for details. In addition, it is possible to download the solution simply as a zipped archive – via the 'get source' button on the very right.). The solution contains some more tests against the PersonRepository class, which are not shown here. Also, it contains database scripts to create and fill the School sample database.
To compile and run, the solution expects the Gallio/MbUnit framework to be installed (which is free and can be downloaded from here), the NDbUnit framework (which is also free and can be downloaded from here), and the Typemock Isolator tool (a fully functional 30day-trial is available here). Moreover, you will need an instance of the Microsoft SQL Server DBMS, and you will have to adapt the connection strings in the test projects App.config files accordingly.