Load Any Object From Most Any Database
Alaric Dailey
8 Feb 2010
Building An O/R Mapper Step 1
Introduction
It seems like everyone is writing Object Relational Mappers these days, some good, some not so good. Opinions
aside, (N)Hibernate is probably the most famous; however, sometimes it is more than you need, and
sometimes it is just not what you are looking for. Sometimes you have to use databases that the existing host of
O/R Mappers doesn't support. Whatever the reason, the premise of this article is filling your objects
from most any database table. In fact, if you can connect to the database with a .NET data provider and write a
query to get the data, then this code can be used to fill in your objects.
Background
To keep things simple, here are some of the things I am not going to cover:
Getting the connection string and provider name from the config, and connecting to pretty much any database,
because I covered that in a previous article.
Inherited objects
Creating custom attributes, which could be used to solve some of the problems I am going to show; this article
is only a place to start, not a complete solution.
Connection pooling to enhance performance
Threading to fill the objects, though if you want a good place to start, you will need a thread-safe list. You
may want to check out my articles on a Thread Safe List and on Simplifying ReaderWriterLocks.
Tracking changes to the objects. Too much to cover in this article.
Saving the objects back to the databases. Too much to cover in this article.
For starters, I really don't want to mess around with Activator.CreateInstance, so I will create a generic function.
Next, I do not want to tie this to any particular data provider, so my plan was to use DbDataReader;
however, ReSharper was nice enough to point out that I could use the IDataReader interface to be even more
generic. Because of the uniqueness of the function, I will touch on just the declaration first.
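Give or take the exact names, the declaration looks like this (the parameter name dr matches the body shown further down):

public static BindingList<T> LoadObjectsFromDataReader<T>(IDataReader dr)
    where T : new()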
I marked the function as static because I am not touching any member variables of the containing class.
The BindingList<T> gives the most flexibility for binding to Windows controls. Probably the most confusing
thing to people who haven't used generics (or templates) much is the <T> in the function declaration: it allows
me to use most any object I like and return it in a strongly typed BindingList. The where T : new() tells the
compiler that not just any object will do; the object must have a public parameterless constructor. So let's have
a look at the body of the function.
public static BindingList<T> LoadObjectsFromDataReader<T>(IDataReader dr)
    where T : new()
{
    BindingList<T> retval = new BindingList<T>();
    Type type = typeof(T);

    //Mapping Block
    List<PropertyInfo> itemProperties = new List<PropertyInfo>();
    for (int i = 0; i < dr.FieldCount; i++)
    {
        string s = dr.GetName(i);
        var pi = type.GetProperty(s, BindingFlags.Instance |
            BindingFlags.Public | BindingFlags.SetProperty);
        //pi will be null if no matching property is found
        itemProperties.Add(pi);
    }

    //data block
    object[] oo = new object[itemProperties.Count];
    while (dr.Read())
    {
        dr.GetValues(oo);

        //could be threaded block
        T item = new T();
        int fieldIndex = -1;
        foreach (var pi in itemProperties)
        {
            fieldIndex++;
            if (pi != null)
            {
                object o = oo[fieldIndex];
                if (DBNull.Value.Equals(o))
                {
                    o = null;
                }
                try
                {
                    pi.SetValue(item, o, null);
                }
                catch
                {
                    //eat data errors quietly
                }
            }
        }
        retval.Add(item);
        //end of could be threaded block
    }
    return retval;
}
I will break this down by the comments, since the code is a little different from what most people are used to.
Mapping Block
To start the function, after declaring some variables, I loop through all the fields in the datareader, getting their
names by ordinal (index position). Then, using reflection, I look up the property with that name.
The BindingFlags make sure I only get the property if it has a public setter and is not a static property.
Regardless of whether or not the property is found, I add the result to my collection, so a missing property
shows up as a null entry.
Data Block
Now I loop through all the records in the datareader. Rather than looping through every field for every record,
I pull all of the fields into my array of objects with dr.GetValues(oo). At this point, I create a new object
of the proper type.
Now comes the fun part: for each record, I loop through the collection of properties, which is in the same ordinal
order as the datareader. If a property was not found, or it did not have a public setter, I skip over it. This
prevents me from doing any unnecessary work.
Because NULL values in the database are not returned as null in .NET, but as DBNull.Value, I check to
see if that is what came back for the column's value, and if so, I make it null.
At this point, I assign the value by calling pi.SetValue, passing in the object whose property is being set
and the value I am setting. However, since there MAY be a data type mismatch, I wrapped the call in
a try/catch block.
Last thing, we add our newly filled in object to our BindingList<T>. When finished with all the records, I return
the binding list with all the fresh data.
Ideally, using the code would look a little like the block below, taking into consideration that you aren't going to
want to spread your database code throughout your application. This block gives you what you need to make it
work, though in practice the code may be spread out over several functions in different places, like a database
class or a base class for the business object.
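Here is a rough sketch of that usage, assuming the connection string and provider name come from the config as covered in my earlier article (the connection string name and the Vehicle class are just placeholders):

//rough sketch; requires System.Configuration, System.ComponentModel,
//System.Data and System.Data.Common
ConnectionStringSettings settings =
    ConfigurationManager.ConnectionStrings["MyConnectionStringName"];
DbProviderFactory factory = DbProviderFactories.GetFactory(settings.ProviderName);

using (DbConnection connection = factory.CreateConnection())
{
    connection.ConnectionString = settings.ConnectionString;
    connection.Open();

    using (DbCommand command = connection.CreateCommand())
    {
        //use the class name as the table name
        command.CommandText = "select * from " + typeof(Vehicle).Name;

        using (IDataReader dr = command.ExecuteReader())
        {
            BindingList<Vehicle> vehicles =
                LoadObjectsFromDataReader<Vehicle>(dr);
        }
    }
}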
Using the poorly performing "select * from" statement will pull all data from the table by using the name of the
class as the name of the table. After building the connection and command, we execute a datareader and pass
it to our newly created function.
Shortcomings
This function has some serious shortcomings, so let's make them clear. This code assumes the column names
and table names are all valid variable and class names in C#, so a column name like "Vehicle Identification
Number" won't work at all. The block I use to call the code derives the table name from the class name. This will
work in the majority of cases, since most database designs don't take advantage of schemas, but it also
means that things like multiple databases or linked servers can't be used.
Wrapping the data assignment in a try/catch block isn't the best solution; it's just the quickest way to
prevent partially consuming the datareader and throwing unexpectedly.
The select * from statement is always inefficient; it would be better to select the column names individually.
However, there is no way to know which properties on an object correspond to column names, or which
columns have to be mapped because their names are not valid property names. A perfect example of this is in
my example database: the Vehicle table has a column called Vehicle Identification Number, and because it
contains spaces, no property will ever be found with that name.
In this code, I am not wrapping the table name in the proper characters to prevent issues with SQL and
database-specific keywords. I didn't include this because getting that information reliably takes a little more than
simply getting a command builder. You could even get the command builder to fix your SQL statement using
the "one last trick" from one of my previous articles; a rough sketch of the idea follows.
This code will be slower than necessary because we are setting properties. Because of this, we also lose some
of the benefit that we should have: the hiding and protecting of data. Do we really want the user to be able to
overwrite the data in the primary key column(s)? Tracking changes and validating values when a property is set
is also much more difficult, because there is no easy way to know whether the property is being set by this
function or by some other piece of code.
Had we chosen to fill the private fields, possibly from a base class, and used custom attributes, we could fix
some, if not all, of these problems.
Tracking changes to the class would be absolutely necessary for the class to be able to update the proper
row in the database.
In the LoadObjectsFromDataReader function, you will see a block marked "could be threaded block";
theoretically, all the code within that block could be run on a separate thread, allowing faster reading of
the datareader. The issue with doing so is that all of the fills have to finish before the return can happen.
Something to think about, though.
Points of Interest
Despite the shortcomings, this is still a reasonably efficient and very flexible way to get data from the database
and into objects, objects that we can then query with LINQ to Objects.
Loading objects from multiple data sources on different systems with this approach, and then querying the
loaded data using LINQ to Objects, is a great alternative to a hard-coded provider framework
like LINQ to SQL, especially when you need to correlate data from multiple disparate sources.
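As a quick, made-up illustration (the Owner class and the property names are hypothetical), correlating objects loaded from two different databases might look like this:

//hypothetical example: Vehicle loaded from one database, Owner from another
BindingList<Vehicle> vehicles = LoadObjectsFromDataReader<Vehicle>(vehicleReader);
BindingList<Owner> owners = LoadObjectsFromDataReader<Owner>(ownerReader);

var vehiclesWithOwners =
    from v in vehicles
    join o in owners on v.OwnerId equals o.Id
    select new { o.Name, v.Make, v.Model };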
In future articles, I will start going over how to address the shortcomings.
History
6th February, 2010: Initial post
8th February, 2010: Article updated