Friday, April 26, 2013

C# Snippet Sharing with Roslyn and SignalR

I've begun creating an Azure app that will take sharing C# to a new level

You can check out the initial plan for it here

http://www.codeproject.com/Articles/583477/Csharp-Snippet-Sharing-with-Roslyn-and-SignalR

I've had the idea for a while, but decided to put it in writing and enter the idea into the Windows Azure contest currently going on at CodeProject

Monday, April 1, 2013

Removing excess lambda expressions with a Roslyn Visual Studio plugin

I've created a Roslyn application that will let us write LINQ statements and make sure that we don't copy and paste them in the same class

The plugin, called 'Refactor Lambdas' does the following :

  • Looks for duplicated lambda expressions
  • Once detected, breaks them out into a separate function
  • Names the function something based on what the function is doing
  • Has some capacity to do fuzzy matching (if 2 lambdas are logically the same but use outer-scope variables with different names and the same type, it can still refactor those 2 lambda expressions)
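
To make the list above concrete, here is a rough before/after sketch of the kind of change it describes. The `Order` type and every method name here (including `IsOpenOrder`) are made up for illustration, and the sketch uses LINQ to Objects for simplicity; the plugin's real output may well look different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this illustration.
public class Order
{
    public int Id { get; set; }
    public bool IsOpen { get; set; }
}

public static class OrderQueries
{
    // Before: the same lambda is pasted into two methods of one class.
    public static int CountOpenBefore(IEnumerable<Order> orders)
    {
        return orders.Count(o => o.IsOpen);
    }

    public static List<Order> ListOpenBefore(IEnumerable<Order> orders)
    {
        return orders.Where(o => o.IsOpen).ToList();
    }

    // After: the duplicate is broken out into one method named after what it does.
    private static bool IsOpenOrder(Order o)
    {
        return o.IsOpen;
    }

    public static int CountOpenAfter(IEnumerable<Order> orders)
    {
        return orders.Count(IsOpenOrder);
    }

    public static List<Order> ListOpenAfter(IEnumerable<Order> orders)
    {
        return orders.Where(IsOpenOrder).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Id = 1, IsOpen = true },
            new Order { Id = 2, IsOpen = false },
            new Order { Id = 3, IsOpen = true }
        };

        Console.WriteLine(OrderQueries.CountOpenBefore(orders)); // 2
        Console.WriteLine(OrderQueries.CountOpenAfter(orders));  // 2
    }
}
```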


Showing may be easier than explaining in this situation.  Here's how the plugin would react to a scaffolded controller from the MVC scaffolding project (A gallery of images is below if you don't have Flash)





Here's a link to the repository on BitBucket


I would enjoy having this as a part of ReSharper so I think it's actually a worthwhile plugin.  Also you could get really crazy and search the entire codebase, with a fuzzy search, and condense all duplicated lambdas.  It would have to be torn apart but I think this is a good end-goal to have with this project.

Some Todos in case anyone is interested.  This would be a good project for someone to get their feet wet in open source :

  • Unit tests are good but need a little work.  I could not find an easy way to unit test without specifying cursor position in each test; yuck.  Still has >90% test coverage and does all testing in memory.  Could use some negative tests.
  • Fuzzy matching could be improved when deciding if 2 lambdas are equivalent. 
  • Make it work on files with multiple classes (not really super important--right now it just takes the first class and does the search there)
  • Speed considerations?  Any way to detect this code issue more efficiently?  
I'll release this as a Visual Studio plugin on the gallery once Roslyn is finished and pushed out to Visual Studio 2012 and is enabled by default.  Comments?

Monday, March 18, 2013

Which ASP.NET MVC validation strategy should we use?

At least as long as programming has existed, people have tried to be tricky.

I'm sure we have all tried to architect something tricky only to have to revert our project and sheepishly go with a simpler solution.

Whenever I look at something tricky that isn't an optimization, it's almost always a tricky way to waste CPU cycles.

The lure of trickiness we'll be looking at is validation in ASP.NET MVC.  Let's look at 3 methods of validation.

The bad way (using model objects)


Unless you want to get hacked in the same fashion GitHub did last year, don't do the following.  You leave yourself open to over-posting (mass assignment) if you bind domain model objects directly to a view; typically only new MVC users try this.  The initial lure might be that you can do this without creating an extra ViewModel.


It's not an extensible way to write code anyway so luckily you have an excuse to not be lazy.  The second you need to use the same domain object on multiple pages for even slightly different purposes, this pattern falls apart and subsequent pages will have trouble accommodating new functionality.

Add a field like this and now we have a security problem because model binding is blind and will greedily bind anything it can.
Actually the above example is prone to being insecure out of the box.  If you only wanted to let the user update their email and password, you'd have to specifically guard the Username and Id (and any new fields as I point out above).

It also shows how unclear the above code is--you may have assumed I was taking the Id and Username from a submitted field even though this was for a profile editing controller.  It's even less readable.
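
The greedy binding problem described above can be sketched without MVC at all. The `NaiveBinder` below is a deliberately simplified stand-in for model binding, not the real MVC binder, and the `User`, `IsAdmin` and `EditProfileViewModel` names are assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Domain object bound directly to the form (the pattern this section warns against).
public class User
{
    public int Id { get; set; }
    public string Username { get; set; }
    public string Email { get; set; }
    public bool IsAdmin { get; set; } // added later; never rendered in the form
}

// ViewModel exposing only what the profile form is allowed to change.
public class EditProfileViewModel
{
    public string Email { get; set; }
}

public static class NaiveBinder
{
    // Copies every posted value whose key matches a writable property name.
    // A simplified simulation of "blind" binding, not the actual MVC implementation.
    public static T Bind<T>(IDictionary<string, string> form) where T : new()
    {
        var model = new T();
        foreach (var property in typeof(T).GetProperties())
        {
            string raw;
            if (property.CanWrite && form.TryGetValue(property.Name, out raw))
            {
                property.SetValue(model, Convert.ChangeType(raw, property.PropertyType), null);
            }
        }
        return model;
    }
}

public static class Program
{
    public static void Main()
    {
        // An attacker adds an extra field to the POST body.
        var form = new Dictionary<string, string>
        {
            { "Email", "new@example.com" },
            { "IsAdmin", "true" }
        };

        var user = NaiveBinder.Bind<User>(form);
        Console.WriteLine(user.IsAdmin); // True: the extra field was bound

        var viewModel = NaiveBinder.Bind<EditProfileViewModel>(form);
        Console.WriteLine(viewModel.Email); // new@example.com: nothing else to bind to
    }
}
```

With the ViewModel there is simply no IsAdmin property for the extra field to land on, which is the whole argument for not binding domain objects.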


The tricky way (using centralized validation close to your model)


Okay, fine we'll use ViewModels, but we're going to be tricky.  We're going to keep all of our validation inside our domain objects.  When someone goes to input some data, we'll map to our domain objects, check the validation rules and then bubble the results up.  It's genius.

Actually it's bad.  For several reasons.  Exceptions to rules will be painful.

Let's not say things we don't mean and have to take them back


It also doesn't take advantage of the model state of MVC which I find to be quite nice and allows you to check errors separately.


  • Once for easy validations like max length and whether a field is required, 
  • Again for dynamic validations that involve costly database operations.


Bundling those two operations isn't good.  Without a simple way to perform client-side checks, that's what you'll be doing (I would just bundle them on the server if you have jQuery validate on the client for the sake of saving a few lines of code in each controller).

I don't care what horrid path Visual Studio Magazine (scroll to 'Integrating annotations') is trying to send you down.  Don't do the tricky way.  It's cute little examples like this that have people being tricky.  But then you feed tricky after midnight and bam...code gremlins.  They even seem to recommend 'The bad way'.

This pattern could work in a fully back-end scenario, but this next pattern is just too durable, efficient and fast for anything else to be used for ASP.NET MVC development.

The smart way (also the easiest way -- using ViewModels and validation attributes)



A good example.  Simple and gets created in the default MVC4 project template.
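
As a minimal sketch of the pattern (not a copy of the template's model), here is a hypothetical `RegisterViewModel` with validation attributes.  In MVC the model binder runs these attributes for you and fills ModelState; to keep the sketch standalone it calls `Validator.TryValidateObject` directly.  The property names and length limits are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical ViewModel; names and limits are chosen only for illustration.
public class RegisterViewModel
{
    [Required]
    [StringLength(20, MinimumLength = 3)]
    public string UserName { get; set; }

    [Required]
    [StringLength(100, MinimumLength = 6)]
    public string Password { get; set; }
}

public static class ViewModelValidation
{
    // Runs every validation attribute on the object and returns the failures.
    public static List<ValidationResult> Validate(object viewModel)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(viewModel, null, null);
        Validator.TryValidateObject(viewModel, context, results, true);
        return results;
    }
}

public static class Program
{
    public static void Main()
    {
        // UserName is too short and Password is missing.
        var bad = new RegisterViewModel { UserName = "ab", Password = null };
        foreach (var error in ViewModelValidation.Validate(bad))
        {
            Console.WriteLine(error.ErrorMessage);
        }

        var good = new RegisterViewModel { UserName = "brian", Password = "secret123" };
        Console.WriteLine(ViewModelValidation.Validate(good).Count); // 0
    }
}
```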


But author, creating a view model AND validation attributes for each form is more typing.  How is it the easiest way?  Because ordering your thoughts is hard.  Not being able to trigger your validation on the client is hard.  Having to mess with your DbContext just to validate user input (noise++ to your codebase) is hard.  Having to make an exception to what you thought were your domain rules becomes a hack (I'll remove this one validation rule..)

Let me be clear about that last point.  Adding something and then removing it later down the pipeline before you do any real work is a HUGE anti-pattern.  I don't know this anti-pattern by a name, but this is one of the biggest mistakes I see in code.  It's unclear, it's a hack, it's inefficient, it shouldn't be done.


Here is the article to read that will explain the best practices of the ViewModel pattern.   (Although the author doesn't mention that your HTML will be rendered with jQuery validate baked into it which is a big plus)


It's worth noting that this works great with ajax heavy sites.  I almost exclusively use ajax to submit forms and working with data annotation attributes is still preferred.  Data annotation validators will cover any  non-dynamic validations you need.  The jQuery validate attributes will be rendered in your HTML which you can easily call out to yourself (a huge benefit over the previous example--an easy way to make super responsive pages by having automatic client side validation.)

The ViewModel pattern solves so many issues.  It's not completely D.R.Y., though.

Or is it?

If you have two pages that allow modification of a profile (let's say, registration and profile updating) you will have to copy and paste your name length validator attribute to both view models.  I would say that attributes are data and not code (we've been blurring the line lately.. look at expressions) and that the code is still D.R.Y.

Conclusion

Most MVC developers already know, and practice, everything I've said above.  So why would I spend my time writing about it?   Because why not examine the other avenues and possible architectures even if they end up being poor choices?  If you ever felt like maybe you were missing something or you were wasting time when writing a bunch of ViewModels and repeating attributes on them, well, you should feel a bit more justified after reading this.  It's the standard--keep doing it.

Also, hasn't everyone tried conquering validation with a tricky pattern at some point?  I think the answer is probably yes.  But at the end of the day, being as specific as you can and failing as soon as you can (Client passes -> server side checks passes -> dynamic server side checks pass -> okay go), is always going to be king and it does not facilitate tricky patterns.
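
The "fail as soon as you can" ordering on the server can be sketched as two phases: cheap static rules first, and the costly dynamic check only if those pass.  Everything here (the `RegistrationValidator` name, the rules, the counter standing in for database hits) is a hypothetical illustration rather than MVC API.

```csharp
using System;
using System.Collections.Generic;

public static class RegistrationValidator
{
    // Counts simulated database hits so the short-circuit is observable.
    public static int DatabaseLookups;

    // Phase 1: cheap static rules (what data annotation attributes would cover).
    public static List<string> CheckStaticRules(string userName)
    {
        var errors = new List<string>();
        if (string.IsNullOrEmpty(userName))
            errors.Add("User name is required.");
        else if (userName.Length > 20)
            errors.Add("User name must be 20 characters or fewer.");
        return errors;
    }

    // Phase 2: dynamic rule standing in for a costly database query.
    public static bool IsUserNameTaken(string userName, ICollection<string> existingUsers)
    {
        DatabaseLookups++;
        return existingUsers.Contains(userName);
    }

    public static List<string> Validate(string userName, ICollection<string> existingUsers)
    {
        var errors = CheckStaticRules(userName);
        if (errors.Count > 0)
            return errors; // fail early; skip the costly check entirely

        if (IsUserNameTaken(userName, existingUsers))
            errors.Add("User name is already taken.");
        return errors;
    }

    public static void Main()
    {
        var existing = new HashSet<string> { "brian" };

        Console.WriteLine(Validate("", existing).Count);      // 1
        Console.WriteLine(DatabaseLookups);                   // 0
        Console.WriteLine(Validate("brian", existing).Count); // 1
        Console.WriteLine(DatabaseLookups);                   // 1
    }
}
```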

This post may leave you thinking about unit testing as much as possible, because there are some maintainability considerations with validation attributes on ViewModels even if it is the best way to write your MVC application.  What do you think?

Monday, March 4, 2013

LINQKit is good and more people should use it

If I had to pick one library that is criminally underused in the C#/.NET world, it would be LINQKit, a library for making it easier to manipulate and work with expressions (most often used in conjunction with LINQ).

I think part of the reason it's underused is because it's basically 100% complete, a few years old, and you can easily live without it; though I'll try to convince you that you don't want to.  LINQKit is boring because it does one thing and it's been doing it for years already.  But it is horribly underused from what I've seen.

LINQKit has become almost an automatic addition to .NET projects I start.  Most of the time I start a medium sized or bigger C# project, I 'install-package LINQKit'.

Which helps bring me to my next point.  I'd make the bold statement that if you have an app that's >50,000 lines of code and it's heavy on data access/Entity Framework/lambda expressions, chances are your codebase could be cleaned up and DRYed out if you used LINQKit.  I decided to write this article because  few people seem to use it and I have a friend working on a commercial project that would be cleaned up by LINQKit because it's very expression heavy (unfortunately one of the project requirements is no 3rd party libraries).  It has me wondering how many coders are unaware of this awesome library.


Only 13,000 NuGet downloads.  1.2 Million for Entity Framework!

I considered making a code example for this post but decided against it.  There are many short examples that will illustrate the usefulness to you.

Here are a few links with convincing examples of LinqKit usage

Friday, March 1, 2013

A brief Entity Framework Codebase Overview

Open source enterprise projects are a funny thing.  Enterprise software leaks seem to get read more often than enterprise open source software; once you willingly release your software, people seem to stop being infatuated with it unless they plan to modify the code.

Few people seem to actually read a portion of it just for the sake of reading it; especially on massive projects.

I couldn't find a single article that gave an overview of what's going on inside the Entity Framework project so I'm hoping to give some objective insight on what I found and point out some of the more interesting things along the way.

Solution Overview

(As of commit a88645f8581c)
Visual Studio 2012
Language : C# with some tests in VB.NET
Lines of Code : 188,547 (145,126 not counting tests)
Projects : 10


Test Projects : 4 (2 Functional, 1 Unit, 1 VB.NET)
Test Framework : xUnit (No test initialize supported or test class constructors used)

Solution Build Time : 29.26 seconds

Test count, run time


Unit : 4713 tests ran in 233.59 seconds
Functional : 3541 tests ran in 822.97 seconds
Functional.Transitional : 1865 tests ran in 344.25 seconds
VB.NET : 47 tests, ran in 6.28 seconds

(All Done on quad core i7 3.4 GHz with TestDriven.Net)

FxCop rules Standards : Microsoft Managed Recommended rules and a custom ruleset for the EntityFramework assembly.
FxCop Rules suppressions : 2,345

Some Code Overview


How are the nuances of every different SQL Server version handled?  This was one of my biggest curiosities when I opened this project.  Take a second to postulate.

The answer?  Lots of inline version checks and great test coverage.  Here's some code that has to wrangle multiple versions of SQL Server.

Fun fact : SQL Server 2008 is still referred to as its codename "Katmai" in comments and method names all over the codebase.  Same goes for SQL Server 2005 "Yukon"

Here are the accompanying tests for the above code that cover all SQL server versions.  (Interesting that they chose underscores to separate words in test names over camel casing.)


Part of the coding standard is to use 'var' in local variable declarations wherever possible.  Not uncommon and a part of my standards as well.


I noticed some copied and pasted classes even when both copied classes are in projects that have a common dependency on the EntityFramework assembly.  This is likely no problem for a team with rigorous reviewing standards but it could be a bit nicer. Don't take this  as "The entire codebase is copied and pasted"; it's really just something that piqued my interest.



The new Analyze Solution for Code Clones functionality in Visual Studio 2012 rocks, by the way.  It's still not as good or as fast as Atomiq (pictured above) but it's worth trying out.

Conclusions 


The code is extremely well written.  The test coverage is fantastic and the entire codebase looks like it was coded by a single super-competent programmer with how consistent the coding standard is.  Interesting that they went with xUnit and didn't put any code in test constructors (xUnit does not support test initialization so if you're using it with constructors, you may as well just use NUnit).  It seems to make a lot of sense with how big some of the test classes got.

The 2000+ suppressed rules feel like overkill (a good number are targeted at defending variable spelling).  It feels counterproductive to say "We are going to adhere to this ruleset" and then make 2,345 exceptions.

Taking suppression fire
A good number of these suppressions don't have the justification parameter filled in.  Finding an instance where a catch-all exception handler wasn't explained in a comment or justification is discouraging.  You will see the most rigorous FxCop ruleset applied in many Microsoft open source projects.

I hope this gets people excited about a great project and supplied you with a very brief introduction.  I may do a part two that actually goes a bit more in depth if there is an interest.


Tuesday, February 26, 2013

Slides + Code from LINQ Presentation



You can get the accompanying source from the presentation here (includes solution and project, just browse upward in the directory).

In the 10 examples you learn how to

1.  Create a query
2.  Create a query and have it produce a result
3.  Treat an expression predicate as a value
4.  Compile an expression tree to IL
5.  Use LINQ to objects
6.  Build a dynamic expression tree
7.  Compose Queries

Possible future additions : Discuss yield keyword, discuss LINQKit for composability

Code for main program embedded below as well



using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Linq.Expressions;

namespace PresentationV12
{
    class Program
    {
        static void Main(string[] args)
        {
            #region Example 1 LINQ Query
            using (var repository = new Database1Entities())
            {
                var query = repository.Users.Where(user => user.UserId == 1);
            }
            #endregion

            #region Example 2 LINQ With Result

            using (var repository = new Database1Entities())
            {
                var usr = repository.Users.Where(user => user.Username.EndsWith("Brian")).FirstOrDefault();
            }
            #endregion

            #region Example 3 Expression

            using (var repository = new Database1Entities())
            {
                Expression<Func<User, bool>> expression = user => user.UserId != 5;

                //Entity framework does similar parsing to the below and translates the expression to SQL
                ParameterExpression param = (ParameterExpression)expression.Parameters[0];
                BinaryExpression operation = (BinaryExpression)expression.Body;
                var left = operation.Left;
                var right = operation.Right;

                var usr = repository.Users.Where(expression).FirstOrDefault();
            }
            #endregion

            #region Example 4 Not using Expressions - LINQ To objects
            using (var repository = new Database1Entities())
            {
                //Extension for IQueryable
                var usrs = repository.Users.Where(user => user.UserId != 5).ToList();

                //Different extension method, IEnumerable.
                var user2ndQuery = usrs.Where(user => user.UserId != 6);
            }
            #endregion

            #region Example 5 Compiling Expression Tree

            using (var repository = new Database1Entities())
            {
                Expression<Func<User, bool>> expression = user => user.UserId != 5;

                var usrs = repository.Users.ToList();

                //Can't query objects with Expression<Func<User, bool>> expression
                //Error - var user2ndQuery = usrs.Where(expression);

                //But you can reuse that expression like this to do that v
                var compiledExpression = expression.Compile();
                var user2ndQuery = usrs.Where(compiledExpression);
            }
            #endregion

            #region Example 6 Building a dynamic Expression tree

            var param1 = Expression.Parameter(typeof(User), "user");
            var prop = Expression.Property(param1, "UserId");
            var Right = Expression.Constant(5);
            BinaryExpression expr = Expression.MakeBinary(ExpressionType.NotEqual, prop, Right);

            //Note: Compile() turns the tree into a delegate, so the Where below is the
            //IEnumerable (LINQ to objects) overload and the filter runs in memory
            var userExpression = Expression.Lambda<Func<User, bool>>(
                expr,
                param1
                ).Compile();

            using (var repository = new Database1Entities())
            {
                var usr = repository.Users.Where(userExpression).ToList();
            }

            #endregion

            #region My Number 1 Tip for writing good LINQ
            //Compose your queries so you don't repeat yourself
            //Lack of easy composability in SQL is the worst crux of writing SQL
            //If you're copying and pasting a lot, you're doing it wrong
            using (var repository = new Database1Entities())
            {
                var userQuery = repository.Users
                    .Where(UserNotTwo())
                    .Where(UserNotBrian()).Where(NewMethod());

                var usrList = userQuery.ToList(); //lazy evaluation "combines" both criteria
            }
            #endregion

            #region SelectMany Flattening
            //Flattens collections of collections
            //Fairly common for a collection to contain collections with navigation properties in Entity Framework
            //Necessary because there's no looping in a single LINQ expression--You'd have to bring the data down and loop yourself
            //http://blog.falafel.com/Blogs/adam-anderson/2010/06/29/Flatten_Nested_Loops_with_LINQ_and_SelectMany

            using (var repository = new Database1Entities())
            {
                var complexList = new List<List<string>>() { new List<string> { "test", "test2" }, new List<string> { "test3", "test4" } };
                var easyList = complexList.SelectMany(lst => lst);

                var superComplexList = new List<List<object>>() { new List<object> { "test", "test2", 25 }, new List<object> { "test3", "test4", new List<string>{ "test"} } };
                var kindOfEasyList = superComplexList.SelectMany(lst => lst);
            }
            #endregion
        }

        private static Expression<Func<User, bool>> NewMethod()
        {
            return tt => tt.Email != "brian.rosamilia@gmail.com";
        }

        #region Helper Functions
        private static Expression<Func<User, bool>> UserNotBrian()
        {
            return usr => usr.FirstName != "brosamilia@example.com";
        }

        private static Expression<Func<User, bool>> UserNotTwo()
        {
            return user => user.UserId != 2;
        }
        #endregion
    }
}

Monday, February 25, 2013

6 Simple New(er) C#/.Net features you may have missed

So you're on an interview for a .NET job and you're feeling pretty good about yourself.  You've been using Visual Studio 2012 and .NET 4.5 for the last few months and your resume indicates the same; and then this question :

What are your favorite new features in C# 5.0 and .NET 4.5?

Uhh.. Well.  I....

It's actually a pretty difficult question this time around because it's not a major release and I would say that 3.0/3.5 was the last time I observed a radical change in the way we write C#.

But could you even answer the question for .NET 4?    In this post I'm going to try to remind you of some of the features of C# 5.0/4.0 and .NET 4/4.5  that are both easy to remember and use.


BigInteger 


I thought we'd start simple.  In the Framework Class Library, there is a new DLL (System.Numerics) with  only two public types in it; one of them is the BigInteger struct.  BigInteger allows for arbitrarily long integers and could find use in scientific programming.
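
A small sketch of what "arbitrarily long" buys you, assuming a reference to System.Numerics.  The `Factorial` helper is made up for this example; `BigInteger.Pow` is part of the struct.

```csharp
using System;
using System.Numerics;

public static class BigIntegerDemo
{
    // Hypothetical helper: n! overflows a long once n passes 20, but not a BigInteger.
    public static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
        {
            result *= i;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(BigInteger.Pow(2, 100)); // 1267650600228229401496703205376
        Console.WriteLine(Factorial(25));          // 15511210043330985984000000
    }
}
```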

Tuple 


One of my favorite features in .NET 4 is the addition of the Tuple.  Tuples allow you to easily capture related values without having to go through the trouble of creating a new class.  It's a much more flexible concept than the KeyValuePair struct because a Tuple can hold up to 8 values.

Tuple is a borrowed concept from F# that's been wedged into C# using the power of generics (and the copying and pasting of a class about 8 times).

Here's an example of the usage


public void Tuple()
{
    var tup = new System.Tuple<int, string, DateTime>(1, "Brian", DateTime.Now);
    Debug.Write(tup.Item1 + tup.Item2 + tup.Item3);
}


Here's the class I used above (The version that holds 3 generic values)


namespace System
{
    [Serializable]
    public class Tuple<T1, T2, T3> : IStructuralEquatable, IStructuralComparable, IComparable, ITuple
    {
        public Tuple(T1 item1, T2 item2, T3 item3);

        public T1 Item1 { get; }

        public T2 Item2 { get; }

        public T3 Item3 { get; }

        public override bool Equals(object obj);

        public override int GetHashCode();

        public override string ToString();
    }
}

Interesting to note that, yes there are 8 different versions of the class to support Tuples that hold up to 8 values.

Caller Information Attributes


Caller information attributes are a very useful feature implemented in the C# 5.0 compiler.  Now you can get information about how your method was called.  This is a huge improvement on having to look into the stack trace in order to achieve diagnostic level logging.  Here's an example of how it would work in a method that takes a string parameter.
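
A minimal sketch of such a method, using the three attributes from System.Runtime.CompilerServices.  The `Describe` and `SaveOrder` names are made up for this illustration; the compiler fills in the optional parameters at each call site.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CallerInfoDemo
{
    // The optional parameters are supplied by the compiler, not by the caller.
    public static string Describe(
        string message,
        [CallerMemberName] string memberName = "",
        [CallerFilePath] string filePath = "",
        [CallerLineNumber] int lineNumber = 0)
    {
        return string.Format("{0} (called from {1} in {2}, line {3})",
            message, memberName, filePath, lineNumber);
    }

    // Hypothetical caller; its name is what ends up in memberName.
    public static string SaveOrder()
    {
        return Describe("Saving order");
    }

    public static void Main()
    {
        // The file path and line number depend on where this code is compiled.
        Console.WriteLine(SaveOrder());
    }
}
```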














Named and optional Arguments

Named and optional arguments are one of my favorite features because they allow API designers to have a LOT of calling options (as long as there are defaults) but not make your code verbose; I find any API taking advantage of this to be pure bliss.

Named Parameters allow you to specify  your parameters in any order and omit any parameters with defaults as long as everything without a default value is specified.

So for example, both of the following method calls are valid


public void NamedParameters()
{
    OutPut(message: "Test");
    OutPut(message: "Test", newLine:false);
}

public void OutPut(bool newLine = true, string message = "TestMessage", bool AddAmperSand = true)
{
    Debug.Write(message);
    if (AddAmperSand) Debug.Write("&");
    if (newLine) Debug.Write(Environment.NewLine);
}


Named parameters have another benefit.  If you're working with a lot of code that has verbose calls that look like SendMessage(true, true, "messageData", false, 5) and are completely unreadable because of the number of parameters, you can refactor them into something like SendMessage(Repeat:true, Tolerance:true, Data:"MessageData", Alerts:false, RepeatSignal:5)

Of course this isn't a hard and fast rule, but more of a preference (I probably still lean toward the original.  There could be even more arguments though.).

ExpandoObject


An object you can dynamically add members to.  Lack of practical examples aside, there are a lot of possibilities because you can nest ExpandoObjects to create structured data (think XML-like/HTML-like).  You can think of ExpandoObjects as a Dictionary on steroids.  ExpandoObject is a part of the much bigger support for the dynamic type introduced in C# 4.

public void ExpandoObject()
{
    dynamic sampleObject = new System.Dynamic.ExpandoObject();
    sampleObject.Test = "TestDynamic";
    sampleObject.InvokeExpandoObject = new Action(() => Console.Write(sampleObject.Test));
    sampleObject.InvokeExpandoObject.Invoke();
}

Parallel.ForEach


This concept is a small part of the much bigger Task Parallel Library (TPL) introduced in .NET 4.  The Task Parallel Library is a collection of classes designed to make threading and parallel code easier to write.

If you have a list that needs to be processed, and you don't care about the order in which your items are processed, parallelizing your loop is likely a good idea. This is a really powerful concept because you aren't actually working with tasks at all and get the benefit of having your work parallelized.



Here's a simple example where we output a list in parallel.  You'll see the names don't come out in their original order.  Also, it should be noted that the overhead of the TPL is almost definitely not worth it in this scenario and is actually probably slower than a regular loop.
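
A minimal sketch of that idea follows.  The names in the list and the `Greet` helper are made up for illustration; the results are collected in a ConcurrentBag so that adding from several threads is safe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Processes each name on whatever thread the TPL schedules it on.
    public static List<string> Greet(IEnumerable<string> names)
    {
        var greetings = new ConcurrentBag<string>();
        Parallel.ForEach(names, name =>
        {
            greetings.Add("Hello, " + name);
        });
        return greetings.ToList();
    }

    public static void Main()
    {
        var names = new List<string> { "Alice", "Bob", "Carol", "Dave" };

        // The order of the output is not guaranteed to match the input order.
        foreach (var greeting in Greet(names))
        {
            Console.WriteLine(greeting);
        }
    }
}
```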

Conclusion / Download

You'll notice I left out Async/Await, Dynamic and Covariant/Contravariant generics and probably a bunch of other features.  The reason was to show you some of the SIMPLEST new features that will make your life easier.

It should be noted that the only feature I went over that's actually new in C# 5/ .NET 4.5 is caller information attributes--everything else is new as of .NET 4.

You can get the examples from this post here on bitBucket.  The total of all the examples is 70 lines!  Hopefully you've taken something away from this post.