Wednesday, October 3, 2012

Unit Testing and Entity Framework:
The Filth and the Fury

Just recently I've noticed that there appears to be something of a controversy around Unit Testing and Entity Framework. I first came across it as I was Googling around for useful posts on using MOQ in conjunction with EF. I've started to notice the topic more and more and as I have mixed feelings on the subject (that is to say I don't have a settled opinion) I thought I'd write about this and see if I came to any kind of conclusion...

The Setup

It started as I was working on a new project. We were using ASP.NET MVC 3 and Entity Framework with DbContext as our persistence layer. Rather than crowbarring the tests in afterwards the intention was to write tests to support the ongoing development. Not quite test driven development but certainly test supported development. (Let's not get into the internecine conflict as to whether this is black belt testable code or not - it isn't but he who pays the piper etc.) Oh and we were planning to use MOQ as our mocking library.

It was the first time I'd used DbContext rather than ObjectContext and so I thought I'd do a little research on how people were using DbContext with regards to testability. I had expected to find that there was some kind of consensus and an advised way forwards. I didn't get that at all. Instead I found a number of conflicting opinions.

Using the Repository / Unit of Work Patterns

One thread of advice that came out was that people advised using the Repository / Unit of Work patterns as wrappers when it came to making testable code. This is kind of interesting in itself as to the best of my understanding ObjectSet / ObjectContext and DbSet / DbContext are both in themselves implementations of the Repository / Unit of Work patterns. So the advice was to build a Repository / Unit of Work pattern to wrap an existing Repository / Unit of Work pattern.

Not as mad as it sounds. The reason for the extra abstraction is that ObjectContext / DbContext in the raw are not MOQ-able.

Or maybe I'm wrong, maybe you can MOQ DbContext?

No you can't. Well that's not true. You can and it's documented here but there's a "but". You need to be using Entity Framework's Code First approach; actually coding up your DbContext yourself. Before I'd got on board the project had already begun and we were already some way down the road of using the Database First approach. So this didn't seem to be a goer really.

The best article I found on testability and Entity Framework was this one by K. Scott Allen which essentially detailed how you could implement the Repository / Unit of Work patterns on top of ObjectSet / ObjectContext. In the end I adapted this to do the same thing sat on top of DbSet / DbContext instead.

With this in place I had me my testable code. I was quite happy with this as it seemed quite intelligible. My new approach looked similar to the existing DbSet / DbContext code and so there wasn't a great deal of re-writing to do. Sorted, right?
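A wrapper of this kind might look something like the following. This is only a minimal sketch, not the project's code and not K. Scott Allen's: the Customer entity and the IRepository, IUnitOfWork and InMemory* names are all made up for illustration. Only the in-memory test double is shown so that the sketch has no dependencies; the assumption is that a real implementation would hand the same calls on to DbSet and DbContext.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, invented for this illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical wrapper interfaces. A real implementation would delegate
// to DbSet<T> (Add / Remove) and DbContext (SaveChanges).
public interface IRepository<T> where T : class
{
    IQueryable<T> FindAll();
    void Add(T entity);
    void Remove(T entity);
}

public interface IUnitOfWork
{
    IRepository<Customer> Customers { get; }
    void Commit();
}

// In-memory stand-ins used by unit tests instead of the database.
public class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> _items = new List<T>();

    public IQueryable<T> FindAll() { return _items.AsQueryable(); }
    public void Add(T entity) { _items.Add(entity); }
    public void Remove(T entity) { _items.Remove(entity); }
}

public class InMemoryUnitOfWork : IUnitOfWork
{
    private readonly InMemoryRepository<Customer> _customers =
        new InMemoryRepository<Customer>();

    public IRepository<Customer> Customers { get { return _customers; } }

    // Lets a test check that the code under test asked for a save.
    public int CommitCount { get; private set; }

    public void Commit() { CommitCount++; }
}

public static class Program
{
    public static void Main()
    {
        var unitOfWork = new InMemoryUnitOfWork();
        unitOfWork.Customers.Add(new Customer { Id = 1, Name = "Amy" });
        unitOfWork.Customers.Add(new Customer { Id = 2, Name = "Bob" });
        unitOfWork.Commit();

        var bob = unitOfWork.Customers.FindAll().Single(c => c.Name == "Bob");
        Console.WriteLine(bob.Id);                 // prints 2
        Console.WriteLine(unitOfWork.CommitCount); // prints 1
    }
}
```

Note that FindAll hands back an IQueryable, so in a unit test every query written against it runs as LINQ-to-Objects over the in-memory list.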

Here come the nagging doubts...

I did wonder why, given that I'd found a number of articles about applying the Repository / Unit of Work patterns on top of ObjectSet / ObjectContext, there didn't seem to be many examples of doing the same for DbSet / DbContext. (I did find a few examples of this but none that felt satisfactory to me for a variety of reasons.) This puzzled me.

I also started to notice that a one-man war was being waged against the approach I was using by Ladislav Mrnka. Here are a couple of examples of his crusade:

Ladislav is quite strongly of the opinion that wrapping DbSet / DbContext (and I presume ObjectSet / ObjectContext too) in a further Repository / Unit of Work is an antipattern. To quote him: "The reason why I don’t like it is leaky abstraction in Linq-to-entities queries ... In your test you have Linq-to-Objects which is superset of Linq-to-entities and only subset of queries written in L2O is translatable to L2E". <byTheWay>It's worth looking at Jon Skeet's explanation of "leaky abstractions" which he did for TekPub.</byTheWay>

As much as I didn't want to admit it - I have come to the conclusion Ladislav probably has a point for a number of reasons:

1. Just because it compiles and passes unit tests don't imagine that means it works...

Unfortunately, a LINQ query that looks right, compiles and has passing unit tests written for it doesn't necessarily work. You can take a query that fails when executed against Entity Framework and come up with test data that will pass that unit test. As Ladislav rightly points out: LINQ-to-Objects != LINQ-to-Entities.

So in this case unit tests of this sort don't provide you with any security. What you need are integration tests: tests that run against an instance of the database and demonstrate that Entity Framework will actually translate your LINQ queries / operations into valid SQL.
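To make that concrete, here is a minimal sketch of a query that is perfectly happy in a unit test. The Customer entity, the IsBigSpender helper and the test data are all invented for illustration. The code runs as written against in-memory data; the same query pointed at a real DbSet would compile but fail at execution time with a NotSupportedException, because LINQ-to-Entities cannot translate a call to an arbitrary C# method into SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, invented for this illustration.
public class Customer
{
    public string Name { get; set; }
    public decimal TotalSpend { get; set; }
}

public static class CustomerQueries
{
    // An ordinary C# method. LINQ-to-Objects simply calls it;
    // LINQ-to-Entities has no way of turning it into SQL.
    public static bool IsBigSpender(Customer customer)
    {
        return customer.TotalSpend > 1000m;
    }

    // The query under test. It only sees an IQueryable<Customer>, so it
    // cannot tell whether it is running in memory or against a database.
    public static List<string> BigSpenderNames(IQueryable<Customer> customers)
    {
        return customers
            .Where(c => IsBigSpender(c))
            .OrderBy(c => c.Name)
            .Select(c => c.Name)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory test data, so this is LINQ-to-Objects.
        var fakeCustomers = new List<Customer>
        {
            new Customer { Name = "Zed", TotalSpend = 5000m },
            new Customer { Name = "Amy", TotalSpend = 1500m },
            new Customer { Name = "Bob", TotalSpend = 20m }
        }.AsQueryable();

        var names = CustomerQueries.BigSpenderNames(fakeCustomers);

        Console.WriteLine(string.Join(", ", names)); // prints "Amy, Zed"
    }
}
```

A green unit test here says nothing about whether the query would survive contact with the database, which is exactly the gap an integration test fills.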

2. Complex queries

You can write some pretty complex LINQ queries if you want. This is made particularly easy if you're using comprehension syntax. Whilst these queries may be simple to write it can be uphill work to generate test data to satisfy them. So much so that at times it can feel like you've made a rod for your own back using this approach.

3. Lazy Loading

By default Entity Framework employs lazy loading. This is a useful approach which reduces the amount of data that is transported. Sometimes this approach forces you to specify up front, through use of Include statements, that you require a particular entity. This again doesn't lend itself to testing particularly well.

Where does this leave us?

Having considered all of the above for a while and tried out various different approaches I think I'm coming to the conclusion that Ladislav is probably right. Implementing the Repository / Unit of Work patterns on top of ObjectSet / ObjectContext or DbSet / DbContext doesn't seem a worthwhile effort in the end.

So what's a better idea? I think that in the name of simplicity you might as well have a simple class which wraps all of your Entity Framework code. This class could implement an interface and hence be straightforwardly MOQ-able (or alternatively all methods could be virtual and you could forego the interface). Along with this you should have integration tests in place which test the execution of the actual Entity Framework code against a test database.
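As a rough sketch of what I mean (every name here is hypothetical, invented for illustration): the interface below is all that the rest of the code sees. The real implementation, which would hold the DbContext and run the actual query, is left out so the sketch has no dependencies; it is the part the integration tests would cover. The unit test side uses a hand-written fake, though a Moq mock of the same interface would do the same job.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, invented for this illustration.
public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public bool IsOpen { get; set; }
}

// Callers depend on this interface. It returns materialised lists rather
// than IQueryable, so no LINQ provider differences leak out to callers.
public interface IOrderStore
{
    IList<Order> GetOpenOrders(int customerId);
}

// Hand-written stand-in for unit tests.
public class FakeOrderStore : IOrderStore
{
    private readonly List<Order> _orders;

    public FakeOrderStore(IEnumerable<Order> orders)
    {
        _orders = orders.ToList();
    }

    public IList<Order> GetOpenOrders(int customerId)
    {
        return _orders
            .Where(o => o.CustomerId == customerId && o.IsOpen)
            .ToList();
    }
}

// Code under test: logic that never touches Entity Framework directly.
public class OrderSummaryService
{
    private readonly IOrderStore _store;

    public OrderSummaryService(IOrderStore store)
    {
        _store = store;
    }

    public string Describe(int customerId)
    {
        var count = _store.GetOpenOrders(customerId).Count;
        return count == 0 ? "No open orders" : count + " open order(s)";
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new FakeOrderStore(new[]
        {
            new Order { Id = 1, CustomerId = 1, IsOpen = true },
            new Order { Id = 2, CustomerId = 1, IsOpen = false },
            new Order { Id = 3, CustomerId = 1, IsOpen = true },
            new Order { Id = 4, CustomerId = 2, IsOpen = true }
        });
        var service = new OrderSummaryService(store);

        Console.WriteLine(service.Describe(1)); // prints "2 open order(s)"
        Console.WriteLine(service.Describe(3)); // prints "No open orders"
    }
}
```

The trade-off is that the query itself is no longer unit tested at all; the unit tests only cover what happens around it.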

Now I should say this approach is not necessarily my final opinion. It seems sensible and practical. I think it is likely to simplify the tests that are written around a project. It will certainly be more reliable than just having unit tests in place.

In terms of the project I'm working on at the moment we're kind of doing this in a halfway house sense. That is to say, we're still using our Repository / Unit of Work wrappers for DbSet / DbContext but where things move away from simple operations we're adding extra methods to our Unit of Work class or Repository classes which wrap this functionality and then testing it using our integration tests.

I'm open to the possibility that my opinion may be modified further. And I'd be very interested to know what other people think on the subject.

Update

It turns out that I'm not alone in thinking about this issue and indeed others have expressed this rather better than me - take a look at Jimmy Bogard's post for an example: http://lostechies.com/jimmybogard/2012/09/20/limiting-your-abstractions/.

Update 2

I've also recently watched the following Pluralsight course by Julie Lerman: http://pluralsight.com/training/Courses/TableOfContents/efarchitecture#efarchitecture-m3-archrepo. In this course Julie talks about different implementations of the Repository and Unit of Work patterns in conjunction with Entity Framework. Julie is in favour of using this approach but in this module she elaborates on different "flavours" of these patterns that you might want to use for different reasons (bounded contexts / reference contexts etc). She makes a compelling case and helpfully she is open enough to say that this is a point of contention in the community. At the end of watching this I think I felt happy that our "halfway house" approach seems to fit and seems to work. More than anything else Julie made clear that there isn't one definitively "true" approach. Rather, there are many different but similar approaches for achieving the same goal. Good stuff Julie!

15 comments :

  1. Hey I just came across this when trying to point someone to a helpful example of using Moq with EF. I'm happy that my "balanced" position about designing repositories spoke to you. :) And while I have a lot of respect for him, I couldn't help but laugh at your comment about Ladislav's "one man war" against using repos with EF.

    1. Thanks Julie, glad to be useful. :-) Ladislav is certainly not half-hearted in his views! - Though I must say I'm glad of that as if he wasn't so forceful I wonder if I would have thought about the topic as much as I ended up doing.

      I may be about to take another look at this actually. I remain involved in a number of projects which are driven database first. Consequently our workflow has been:

      1. Database is updated
      2. DbContext / DbSets are updated using the "Update Model from Database..." functionality in the EDMX.
      3. Make changes in our custom Repository / Unit of Work wrappers for DbSet / DbContext.

      I've recently become aware that it's possible to reverse engineer a Code First DbContext using the Entity Framework Power Tools. So I'm now thinking that it may be possible to switch over to using the following workflow:

      1. Database is updated
      2. DbContext / DbSets are updated using the "Reverse Engineer Code First" functionality from the EF Power Tools.

      I’m hoping this would enable us to bin our custom Repository / Unit of Work wrappers and just employ a MOQ / Fakes for unit tests. (Whilst still having integration tests in place but actually directly using DbContext this time.) I don’t know if this will end up panning out or not – I’m going to give it a try and see how feasible this is…

      If I never mention this again then you can imagine it turned out an abject failure :-)

  2. I'm not sure that will be a good workflow. Reverse engineer creates classes for every single table in the database and then one big context. With code first, you do have the possibility of changing the model and then updating the database schema to reflect those model changes. Is it possible to use that path rather than making the changes to the database?

    1. Not quite - all changes happen at a database level first rather than a code level. (This is out of my control unfortunately)

      Though it may be possible to reverse engineer the classes up front, keep only that which I need and then manually keep those classes in line with future database schema changes.

      Provided schema changes aren't extreme (generally they aren't) this is probably a reasonable workflow. I must admit I'm a little fearful as to what a mismatch between my Code-First classes and the database schema might look like but hopefully the EF team have come up with errors that make that sort of mistake relatively easy to diagnose.

  3. Thanks for the article John. It's a topic I've been researching lately so found this very interesting :)

    1. Thanks Phil. It is the subject that never dies! I've recently joined a new project and so been discussing this topic again with the new team.

    2. Hi John,

      I'd be very interested in how you went with the new project and any insights so far with your choice.

      We're currently planning a huge code-base refactoring and I'm a bit conflicted about this subject in particular.

      Thanks!

    3. Hi Alex,

      I'm not sure that my insights have developed that much as in the new project we have a slightly unusual situation. We've got a legacy database which, at least initially, we're having to build on top of. This legacy database doesn't quite reflect our domain as it should be and so we're starting the slightly painful exercise of establishing our own domain and having a mapping layer that moves us from the database modelling of the data to our domain model.

      I digress.

      In the end I think this is all about trade-offs and choices. One thing I am certain of is that I'd probably only go with the Repository / Unit of Work patterns on top of ObjectSet / ObjectContext or DbSet / DbContext approach if I had full control of the structure of the database. I say that as if you're at the mercy of database structure changes that don't particularly lend themselves to unit testing in a straightforward fashion (step forward navigation properties, lazy loading etc) then you can end up in some painful places.

      Hope that helps.

  4. Thanks for the article.
    What do you think about an in-memory database test approach? I think that it could be a good way to get testing without having to write code to implement the Repository / Unit of Work pattern, which for small or medium projects can be overkill. What do you think about this?
    Thanks.
    Federico

    1. Hi Federico,

      Thanks for that. I haven't seen an in-memory database test approach demonstrated so I can't really say. It sounds quite promising but I'd have to see it in action to form a view. I guess the "must have" would be that it performed the exact same operations as an actual database (so LINQ to Entities not LINQ to Objects).

      If you've a good example in mind then do point me to it - I'm interested!

  5. Hey, regarding the in-memory database, I was curious if anyone has employed the use of EFFORT in their projects.

    http://effort.codeplex.com/

    I have messed with it a little, and it does work well, but it's currently in Beta, and apparently, when you upgrade to EF6 (which is also in Beta at the moment), you have to build against a reference to Entity Framework itself, which works, but I fear for future implications.

    I'm surprised Microsoft hasn't created anything to streamline this, or provided a way to make IEnumerables behave as if they were LINQ-to-Entities. The only problem is that the eager loading Include() method would cause many headaches, which makes me fall back to the in-memory database.

  6. Hi John

    Thanks for this article. I am also researching this as I am involved in a project where we want to use a DI container to inject the DbContext; my problem, however, was how to test our repositories without a full blown integration test.

    I think this article and a few others, and also Ladislav's, have made me decide that the halfway house is probably the best option for us. I am not sure why Microsoft have not provided the way forward on this matter as it seems to be a bone of contention for many developers. I digress.

    Anyway, thanks for the good work and keep it up

  7. Nice article - exactly the stuff I was debating with others - provides lots of things to think about

  8. Great article, really got me thinking about our implementation.
