Tuesday, February 22, 2011

Specification Pattern, Entity Framework & LINQ

 

Firstly, just to clarify: I am going to be talking about the OOP Specification Pattern, not the data pattern commonly found in the SID (Shared Information & Data) model.

Much has been said about the specification pattern, so I’m not going to go into that. If you want an overview, check out these posts:

http://www.lostechies.com/blogs/chrismissal/archive/2009/09/10/using-the-specification-pattern-for-querying.aspx
http://devlicio.us/blogs/jeff_perrin/archive/2006/12/13/the-specification-pattern.aspx

In this post I’m going to demonstrate how you can make use of the specification pattern to query Entity Framework and create reusable, testable query objects and eliminate inline LINQ queries.

The Smell

When I first got started with Entity Framework back in 2008, when EF was still in its infancy, we had lots of inline LINQ all over the code base and specific methods on our repositories for each querying requirement (which any OOP purist will tell you is bad).

We had a service layer method which more or less looked something like:

     public IEnumerable<Product> FindAllActiveProducts(string keyword)
     {
         return productRepository.FindAllActive(keyword); 
     }

And then on our repository a method which looked like:

        public IEnumerable<Product> FindAllActive(string keyword)
        {
            var query = context.CreateObjectSet<Product>().Include("Category");


            var products = from p in query
                           where p.IsActive
                                 && p.Name.Contains(keyword)
                           select p;

            return products.ToList();

        }

Now whilst this is not all bad it does present a few code smells:

  • You end up with lots of methods on your repositories to handle different query scenarios
  • There’s no way to test the query in isolation
  • Magic strings for the Include path

I’m not going to touch on whether or not you should be using repositories for this type of query scenario because that’s a whole other topic and a quick Google search will yield many posts debating this very subject. 

However, I will say that if you find yourself with lots of methods on your repositories that only perform query operations, then this is a big code smell.

The Specification Interface

The first step is to define an interface for our Specification.

   public interface ISpecification<T>
   {
       Expression<Func<T, bool>> Predicate { get; }

       IFetchStrategy<T> FetchStrategy { get; set; }

       bool IsSatisifedBy(T entity);
   }

If you’re familiar with this pattern you will notice the addition of two properties: Predicate and FetchStrategy.

The Predicate is ultimately what will be used to perform the query. Note that it is read-only, which forces it to be defined within the specification implementation.
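
To illustrate why the predicate is exposed as an expression tree rather than a plain delegate, here is a minimal sketch. The Product class below is a stand-in with only the two properties used in this post, and PredicateDemo is a hypothetical name. The same expression can be handed to an IQueryable, where a LINQ provider such as EF can translate it, or compiled into a delegate for in-memory checks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Stand-in entity for illustration; the real Product has more members.
public class Product
{
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

// Hypothetical helper class, not part of the post's code.
public static class PredicateDemo
{
    // An expression tree: inspectable data that a LINQ provider can translate.
    public static readonly Expression<Func<Product, bool>> ActiveResharper =
        p => p.IsActive && p.Name.Contains("Resharper");

    // Applied to an IQueryable, the expression is handed to the query provider.
    public static int CountMatches(IEnumerable<Product> products)
    {
        return products.AsQueryable().Count(ActiveResharper);
    }

    // Compiled into a delegate, the same expression runs directly in memory.
    public static bool Matches(Product product)
    {
        Func<Product, bool> compiled = ActiveResharper.Compile();
        return compiled(product);
    }
}
```

A plain Func<Product, bool> would only support the second use; the query provider could not look inside it.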

The FetchStrategy is an abstraction which defines the child objects that should be retrieved when loading the entity. More on this below.

Fetch Strategy

For those who don’t know: when you load an entity from EF (or another ORM) you can choose to load just the entity’s own properties or to load related entities at the same time. In EF you do this with the .Include method on the ObjectQuery.

This works fine; however, fetch strategies often need to be reused in different places, so having .Include calls with magic strings everywhere becomes a real maintenance headache.

In order to alleviate this pain I’ve created an abstraction on the concept. 

   public interface IFetchStrategy<T>
   {
       IEnumerable<string> IncludePaths { get; }

       IFetchStrategy<T> Include(Expression<Func<T, object>> path);

       IFetchStrategy<T> Include(string path);
   }

 

Here is a generic implementation of this IFetchStrategy.

    public class GenericFetchStrategy<T> : IFetchStrategy<T>
    {
        private readonly IList<string> properties;

        public GenericFetchStrategy()
        {
            properties = new List<string>();
        }

        #region IFetchStrategy<T> Members

        public IEnumerable<string> IncludePaths
        {
            get { return properties; }
        }

        public IFetchStrategy<T> Include(Expression<Func<T, object>> path)
        {
            properties.Add(path.ToPropertyName());
            return this;
        }

        public IFetchStrategy<T> Include(string path)
        {
            properties.Add(path);
            return this;
        }

        #endregion
    }

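    public static class Extensions
    {
        public static string ToPropertyName<T>(this Expression<Func<T, object>> selector)
        {
            var me = selector.Body as MemberExpression;
            if (me == null)
            {
                throw new ArgumentException("MemberExpression expected.");
            }

            // me.ToString() yields e.g. "p.Category"; strip the parameter name
            // and the dot, whatever the lambda parameter happens to be called.
            var prefixLength = selector.Parameters[0].Name.Length + 1;
            var propertyName = me.ToString().Substring(prefixLength);
            return propertyName;
        }
    }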

This is all fairly self-explanatory: it maintains a list of the include paths and provides a fluent interface. The ToPropertyName extension takes a LINQ expression and returns the property path it selects (for example, p => p.Category becomes "Category").

Do note however that there is still one Include method that takes a string as the parameter.
This is here to support really deep object hierarchies which can’t be represented as an expression.
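
As a rough illustration of where the expression overload stops working, here is a small sketch. Supplier, Review, Author, and the Reviews collection are hypothetical additions to the model, and ToPath is a stand-alone variant of ToPropertyName. A chain of single-valued navigation properties converts cleanly to a path, but stepping through a collection produces a method call rather than a member access, so that path has to be supplied as a string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical model used only for this sketch.
public class Author { }
public class Review { public Author Author { get; set; } }
public class Supplier { }
public class Category { public Supplier Supplier { get; set; } }
public class Product
{
    public Category Category { get; set; }
    public List<Review> Reviews { get; set; }
}

public static class PathDemo
{
    // Same idea as ToPropertyName: print the member chain, then drop the
    // leading parameter name and dot (for example "p.").
    public static string ToPath<T>(Expression<Func<T, object>> selector)
    {
        var member = selector.Body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException("MemberExpression expected.");
        }
        return member.ToString().Substring(selector.Parameters[0].Name.Length + 1);
    }

    public static string NestedReferencePath()
    {
        // A chain of reference properties is a MemberExpression.
        return ToPath<Product>(p => p.Category.Supplier);
    }

    public static bool CollectionPathIsRejected()
    {
        try
        {
            // Select(...) is a method call, not a member access, so this throws.
            ToPath<Product>(p => p.Reviews.Select(r => r.Author));
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

For the collection case the string overload would be used instead, with a path such as "Reviews.Author".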

You could easily create your own implementations for each scenario e.g. FullProductFetchStrategy and use that, however I tend to define the fetch strategy within the specification itself as you will soon see.
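
A named strategy along those lines might look like the sketch below. The fetch strategy class here is a compact copy written so the example compiles on its own (it does not implement the interface), and the Supplier entity and the Category.Supplier path are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Stand-in entities; Supplier is hypothetical and only shows a nested path.
public class Supplier { }
public class Category { public Supplier Supplier { get; set; } }
public class Product { public Category Category { get; set; } }

// Compact copy of the generic fetch strategy so the sketch stands alone.
public class GenericFetchStrategy<T>
{
    private readonly List<string> paths = new List<string>();

    public IEnumerable<string> IncludePaths
    {
        get { return paths; }
    }

    public GenericFetchStrategy<T> Include(Expression<Func<T, object>> path)
    {
        var member = path.Body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException("MemberExpression expected.");
        }

        // "p.Category.Supplier" becomes "Category.Supplier".
        paths.Add(member.ToString().Substring(path.Parameters[0].Name.Length + 1));
        return this;
    }
}

// Hypothetical named strategy: the include paths live in one place and can
// be reused wherever the "full" product graph is needed.
public class FullProductFetchStrategy : GenericFetchStrategy<Product>
{
    public FullProductFetchStrategy()
    {
        Include(p => p.Category);
        Include(p => p.Category.Supplier);
    }
}
```

The trade-off is one more class per scenario in exchange for a name that documents which graph is being loaded.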

The Specification Implementation

First off we have a base class which contains the basic functionality and implements the ISpecification interface.

    public abstract class SpecificationBase<T> : ISpecification<T>
    {
        protected IFetchStrategy<T> fetchStrategy;
        protected Expression<Func<T, bool>> predicate;

        protected SpecificationBase()
        {
            fetchStrategy = new GenericFetchStrategy<T>();
        }

        public Expression<Func<T, bool>> Predicate
        {
            get { return predicate; }
        }

        public IFetchStrategy<T> FetchStrategy
        {
            get { return fetchStrategy; }
            set { fetchStrategy = value; }
        }

        public bool IsSatisifedBy(T entity)
        {
            return new[] {entity}.AsQueryable().Any(predicate); 
        }
    }

I have given the fetch strategy a getter & setter as it provides a bit of flexibility to the consumers but arguably you could make this read only and force instantiation in the constructor.
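
IsSatisifedBy above wraps the entity in a one-element queryable on every call. An alternative, sketched here under the assumption that the predicate is passed to the base constructor, is to compile the expression once and reuse the delegate. CompiledSpecification is a hypothetical, trimmed-down base class (no fetch strategy) and Product is a stand-in entity.

```csharp
using System;
using System.Linq.Expressions;

// Stand-in entity with only the members used here.
public class Product
{
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

// Hypothetical base class: same Predicate property as in the post, but the
// in-memory check uses a delegate that is compiled once, on first use.
public abstract class CompiledSpecification<T>
{
    private readonly Expression<Func<T, bool>> predicate;
    private readonly Lazy<Func<T, bool>> compiled;

    protected CompiledSpecification(Expression<Func<T, bool>> predicate)
    {
        this.predicate = predicate;
        compiled = new Lazy<Func<T, bool>>(() => predicate.Compile());
    }

    public Expression<Func<T, bool>> Predicate
    {
        get { return predicate; }
    }

    // Method name spelled as in the post's interface.
    public bool IsSatisifedBy(T entity)
    {
        return compiled.Value(entity);
    }
}

public class ActiveProductsByNameSpec : CompiledSpecification<Product>
{
    public ActiveProductsByNameSpec(string keyword)
        : base(p => p.Name.Contains(keyword) && p.IsActive)
    {
    }
}
```

Whether this is worth it depends on how often specifications are evaluated in memory; for a handful of unit tests the original version is perfectly adequate.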

Now, returning to the original example of finding active products whose name contains the keyword provided, here is the ActiveProductsByNameSpec.

   public class ActiveProductsByNameSpec : SpecificationBase<Product>
   {
       public ActiveProductsByNameSpec(string keyword)
       {
           predicate = p => p.Name.Contains(keyword) && p.IsActive;

           fetchStrategy = new GenericFetchStrategy<Product>().Include(p => p.Category);
       }
   }

As you can see everything is defined in the constructor.

Now you could expose a property for the keyword argument and have a single function which builds the predicate.

My preference, however, is to do everything in the constructor: it is immediately obvious what this class requires, whereas exposing properties risks required values being left unset, leading to subtle bugs.

 

Testing the Specification

Testability, for me, is one of the greatest benefits of using this pattern. With deep object graphs, LINQ queries can soon grow in size and complexity. More code == more chance of bugs.

I can count 4 different test cases for this spec and here’s how we can test them.

 
        [TestMethod]
        public void When_Product_Not_Active_Predicate_Should_Find_No_Match()
        {
            var product = new Product {IsActive = false, Name = "Resharper"};
            
            var spec = new ActiveProductsByNameSpec("Resharper");

            var actual = spec.IsSatisifedBy(product); 

            Assert.IsFalse(actual);

        }

        [TestMethod]
        public void 
            When_Product_IsActive_But_Does_Not_Contain_Keyword_Predicate_Should_Find_No_Match()
        {
            var product = new Product { IsActive = true, Name = "Visual Studio" };

            var spec = new ActiveProductsByNameSpec("Resharper");

            var actual = spec.IsSatisifedBy(product);

            Assert.IsFalse(actual);

        }

        [TestMethod]
        public void 
            When_Product_Does_Not_Contain_Keyword_And_Is_Not_Active_Predicate_Should_Find_No_Match()
        {
            var product = new Product { IsActive = false, Name = "Visual Studio" };

            var spec = new ActiveProductsByNameSpec("Resharper");

            var actual = spec.IsSatisifedBy(product);

            Assert.IsFalse(actual);
        }

        [TestMethod]
        public void 
            When_Product_IsActive_And_Contains_Keyword_Predicate_Should_Find_Match()
        {
            var product = new Product { IsActive = true, Name = "Resharper" };

            var spec = new ActiveProductsByNameSpec("Resharper");

            var actual = spec.IsSatisifedBy(product);

            Assert.IsTrue(actual);
        }

 

The Generic Repository

Now that we have our spec we need to create a generic repository which takes the ISpecification interface and returns some Entities.

   public interface IGenericQueryRepository
   {
       // CreateObjectSet<T> requires a reference type, hence the constraints.
       T Load<T>(ISpecification<T> spec) where T : class;

       IEnumerable<T> LoadAll<T>(ISpecification<T> spec) where T : class;

       bool Matches<T>(ISpecification<T> spec) where T : class;
   }

And this is implemented by:

    public class GenericQueryRepository : IGenericQueryRepository
    {
        private readonly ObjectContext context;

        public GenericQueryRepository(ObjectContext context)
        {
            this.context = context;
        }

        #region IGenericQueryRepository Members

        public T Load<T>(ISpecification<T> spec) where T : class
        {
            var query = GetQuery(spec.FetchStrategy);

            return query.FirstOrDefault(spec.Predicate);
        }

        public IEnumerable<T> LoadAll<T>(ISpecification<T> spec) where T : class
        {
            var query = GetQuery(spec.FetchStrategy);

            // Apply the predicate; without it every row would be returned.
            return query.Where(spec.Predicate).ToList();
        }

        public bool Matches<T>(ISpecification<T> spec) where T : class
        {
            var query = GetQuery(spec.FetchStrategy);

            return query.Any(spec.Predicate);
        }

        #endregion

        private IQueryable<T> GetQuery<T>(IFetchStrategy<T> fetchStrategy) where T : class
        {
            ObjectQuery<T> query = context.CreateObjectSet<T>();

            if (fetchStrategy == null)
            {
                return query;
            }

            // Each Include returns a new ObjectQuery, so reassign on every pass.
            foreach (var path in fetchStrategy.IncludePaths)
            {
                query = query.Include(path);
            }

            return query;
        }
    }

 

Pulling it all together

Now that we have a generic repository we can safely get rid of the FindAllActive method on our ProductRepository and instead change our service layer to depend on the IGenericQueryRepository and instantiate the specification like so.

       public IEnumerable<Product> FindAllActiveProducts(string keyword)
       {
           var spec = new ActiveProductsByNameSpec(keyword);

           return queryRepository.LoadAll(spec); 
       }
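
Because the service now depends only on the repository interface, it can be exercised without a database. The following is a minimal sketch of that idea: InMemoryQueryRepository and ProductService are hypothetical names, and the specification types are trimmed copies (predicate only, no fetch strategy) so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Product
{
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

// Trimmed copies of the post's abstractions (no fetch strategy).
public interface ISpecification<T>
{
    Expression<Func<T, bool>> Predicate { get; }
}

public interface IGenericQueryRepository
{
    T Load<T>(ISpecification<T> spec);
    IEnumerable<T> LoadAll<T>(ISpecification<T> spec);
    bool Matches<T>(ISpecification<T> spec);
}

public class ActiveProductsByNameSpec : ISpecification<Product>
{
    private readonly Expression<Func<Product, bool>> predicate;

    public ActiveProductsByNameSpec(string keyword)
    {
        predicate = p => p.Name.Contains(keyword) && p.IsActive;
    }

    public Expression<Func<Product, bool>> Predicate
    {
        get { return predicate; }
    }
}

// Hypothetical test double: runs specifications over an in-memory list
// using LINQ to Objects instead of Entity Framework.
public class InMemoryQueryRepository : IGenericQueryRepository
{
    private readonly List<object> items;

    public InMemoryQueryRepository(IEnumerable<object> seed)
    {
        items = new List<object>(seed);
    }

    public T Load<T>(ISpecification<T> spec)
    {
        return Query<T>().FirstOrDefault(spec.Predicate);
    }

    public IEnumerable<T> LoadAll<T>(ISpecification<T> spec)
    {
        return Query<T>().Where(spec.Predicate).ToList();
    }

    public bool Matches<T>(ISpecification<T> spec)
    {
        return Query<T>().Any(spec.Predicate);
    }

    private IQueryable<T> Query<T>()
    {
        return items.OfType<T>().AsQueryable();
    }
}

// Hypothetical service class wrapping the method shown above.
public class ProductService
{
    private readonly IGenericQueryRepository queryRepository;

    public ProductService(IGenericQueryRepository queryRepository)
    {
        this.queryRepository = queryRepository;
    }

    public IEnumerable<Product> FindAllActiveProducts(string keyword)
    {
        return queryRepository.LoadAll(new ActiveProductsByNameSpec(keyword));
    }
}
```

Bear in mind that LINQ to Objects and LINQ to Entities do not always behave identically (string comparison rules and null handling, for instance), so a fake like this complements tests against a real database rather than replacing them.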

 

Conclusion

Well, that’s it. I hope I’ve demonstrated how you can reduce the number of inline LINQ queries and make such queries easier to test. On my team it is now a rule that all LINQ to Entities queries are defined as Specifications.

The only element of this approach that is specific to Entity Framework is the fetch strategy approach I’ve used and I’m sure this could be easily adapted to fit with other ORMs that support LINQ.

As always I’m happy to hear any feedback and feel free to contact me should you need any clarification.

8 comments:

  1. Any chance of a working project download? I get "Cannot implicitly convert type 'System.Data.Objects.ObjectQuery' to 'System.Data.Objects.ObjectSet'. An explicit conversion exists (are you missing a cast?)" for GenericQueryRepository.

  2. Yep you're right. Good old ReSharper had cleaned it up for me. I have fixed it now.

    This line:

    var query = context.CreateObjectSet<T>();

    Should have been:

    ObjectQuery<T> query = context.CreateObjectSet<T>();

    Will look at getting the source code up within the next week.

  3. I get that same error as well.

  4. You could refactor your ISpecification::IsSatisfiedBy method to type Func<T, bool> and assign the return value of Predicate.Compile()

  5. public IEnumerable<T> LoadAll<T>(ISpecification<T> spec)
    {
    var query = GetQuery(spec.FetchStrategy);

    return query.ToList();
    }
    I believe this return needs to be
    return query.Where(spec.Predicate)
