That Didn't Last Long

When working on DataMgr 2.0, I had several beta versions and a handful of release candidates as well. I was hoping that I would be able to keep the nice version number "2.0" for a while before I had to have a bug fix that would force an ugly version number.

Well, DataMgr 2.0 came out on Monday and today (just Thursday of the same week) a bug has already been found. So, DataMgr is now in version 2.0.1 (just the sort of ugly version number that I was hoping to avoid).

So, if you have already downloaded 2.0, you might want to grab 2.0.1. You can see a description of the issue being solved on the DataMgr page on RIAForge.

In good news, this weeks presentations on DataMgr seemed to go well. You can read Dave Shuck's review on my presentation to the Dallas group. I forgot to record that presentation, but I did remember to record the presentation to the Cleveland group.

As a bit of amusement, the code inside DataMgr that I showed during the questions portion of the presentation was the very code causing the problem that required me to post a new version of DataMgr.

The presentation is about 50 minutes long, including questions - pretty quick for me.

I noticed I had the syntax wrong for putting relational data in your seed data (something I cover correctly in the documentation). 

Presenting on DataMgr

Following on heels of the release of DataMgr 2.0, I will be giving online presentations on DataMgr to two users groups.

I will be presenting to the Dallas CFUG today and the Cleveland CFUG tomorrow. Apparently, I like to do users group presentations in pairs.

I have done a few presentations in person (one of which was recorded), but these will be my first presentations to be done purely across the internet. I am a bit nervous as I generally get a lot of feedback from watching the faces of my audience and I won't have that this time.

Hopefully it will go well and be worth the time for those who attend (always a big pressure). 

DataMgr 2.0 is Gold!

Data Access Layer, DataMgr 2.0 is finally in full release! It automates CRUD functionality without creating code and supports OO or procedural coding.

Data encompasses three major types of functionality:

  • CRUD: Database reads/writes including those of the type performed by cfinsert/cfupdate
  • ActiveSchema: The ability for your code to define your database structure. DataMgr can introspect the database structure or it can define it.
  • Prototyping: The ability to use simulated data for prototyping (much like QuerySim, but more powerful and less work).

DataMgr is different from other available options because it doesn't generate any code. It isn't an ORM solution, but you could easily build an ORM solution on top of it.

The syntax for using DataMgr is the same for any supported database and you can add support for a new database by adding a new CFC (about 250-300 lines of code, on average). DataMgr currently supports the following:

  • MS Access '97+
  • MySQL 4.1+
  • PostGreSQL 8+
  • SQL Server 2000+ 

If you want to upgrade from DataMgr 1.X, copy the new DataMgr files over the existing DataMgr files and then reload the DataMgr component and any components that use it (usually via ?reinit=1 or some such). No change in syntax is required.

DataMgr only incurs a minimal performance overhead when compared to SQL in cfquery.

Built into DataMgr's CRUD functions are better error handling (no more "String or binary data would be truncated" errors) and automatic conversion of special MS Word characters to their ASCII equivalents.

DataMgr can introspect your database or you can define your table structure and have DataMgr create it. You can even have DataMgr create tables with data.

In addition to the supported databases, you can also use a simulated database. This effectively creates QuerySim-like queries without having to type out any data (using table definitions as above).

DataMgr also include several other features (discussed in the beta release of DataMgr 2.0) which can be seen on the DataMgr demonstration site and are included in the DataMgr documentation.

DataMgr is open source an free for any use and can be downloaded from datamgr.riaforge.org.

The Best Laid Plans... (and DataMgr 2 RC4)

Most of my career, I have been either a solo developer or the most senior developer on a team. I have created a handful of open source applications in part to get feedback from other developers about my work. Be careful what you ask for...

I thought that DataMgr RC3 was really the final release under a different name and advertised it as such on my blog. Such hubris could only cost me. I convinced Peter Bell to take a look and he discovered some fairly significant bugs - including a whole feature that I had completely failed to test. He was, of course, very nice about it. Still a bit embarrassing moment none-the-less.

Just days later my friend, Jason Holden, discovered another bug (or rather, a missing feature). So much for "ready for prime time". These bug discoveries came after DataMgr 2.0 has been in production in some form for several months and I have run several tests on it (though clearly not enough).

I have since fixed the bugs that have been discovered and done a bit more testing. This has resulted in one more pre-Gold release, DataMgr 2.0 RC4. In all likelihood, this still won't be the final pre-release version as I have one more adjustment that I would like to make before DataMgr 2 goes gold.

Clearly I have a thing or two to learn about good testing. I have been looking into unit testing, but DataMgr can handle such complex situations that the idea of testing every combination of interactions seems a bit overwhelming.

Anyway, DataMgr 2.0 RC4 has been running without incident for a few days now and is ready to download.

If you find any problems with it, let me know. 

DataMgr Rethinks the Data Access Layer

Prolific blogger, Peter Bell, just posted an entry about "Rethinking the Data Access Layer" wherein he mentions DataMgr and ORM solutions and lists his requirements for a database abstraction tool.

I couldn't resist responding with how DataMgr meets those requirements.

Speaking in Tongues

DataMgr provides one API that works regardless of your database. This currently includes MS SQL, Access, MySQL, and PostGreSQL. Others should be easy to add.

Get Associations

Peter mentions the desire to have CategoryService.getAssociatedProducts(CategoryID). While DataMgr won't do this directly (it is a database astraction layer, not an ORM), it does provide capabilities that are helpful in this regard.

Assuming you use XML to define your relationship (you could use setColumn() instead):

<table name="categories">
  <field ColumnName="CategoryID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true"  />
  <field ColumnName="CategoryName" CF_DataType="CF_SQL_VARCHAR" Length="80" />
  <field ColumnName="Products">
    <relation table="products" type="list" field="ProductID" join-field="CategoryID" />
  </field>
</table>

Then getRecords("categories") would return a field called "products" with a comma-delimited list of ProductID values from the products table for the category in each row.

Cascading Deletes

Taking the example above, the relationship element has an attribute of "onDelete" with a default value of ignore. In the above example, the deletion of a category will have no effect on the products table.

You can change the element to this:

<relation table="products" type="list" field="ProductID" join-field="CategoryID" onDelete="Cascade" />

In which case all related products would be deleted.

Similarly, you could do this instead:

<relation table="products" type="list" field="ProductID" join-field="CategoryID" onDelete="Error" />

In which case DataMgr would throw an error if you tried to delete a category that had related products.

The advantage of this over database constraints is that this works with logical deletes as well.

Manage Joins

I'm not honestly sure I understood what Peter want here, but DataMgr handles several types of joins. The list type (mentioned) earlier can also be used in saving for a many to many relationship (a join-table attribute is needed).

It also has a label relation field:

<table name="categories">
  <field ColumnName="ProductID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" />
  <field ColumnName="ProductName" CF_DataType="CF_SQL_VARCHAR" Length="80" />
  <field ColumnName="Category">
    <relation table="categories" type="label" field="CategoryName" join-field="CategoryID" />
  </field>
</table>

Using this XML, getRecords("products") would return a recordset with a "Category" field that would show the CategoryName of the category for the product in each row.

Magic Fields

In DataMgr, these are "Special" fields.

<table name="categories">
  <field ColumnName="ProductID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" />
  <field ColumnName="ProductName" CF_DataType="CF_SQL_VARCHAR" Length="80" />
  <field ColumnName="Category">
    <relation table="categories" type="label" field="CategoryName" join-field="CategoryID" />
  </field>
   <field ColumnName="DateAdded" CF_DataType="CF_SQL_DATE" Special="CreationDate" />
</table>

They include the ability to store the date a row was added or the date it was last updated as well as the ability to automatically handle manual record sorting as well as logical deletions.

They don't have the ability to handle other types of automated data that Peter mentioned, but I would be open to adding more Special types in future versions.

Everything I Need - and Nothing More

I think this is where DataMgr shines. It doesn't enforce an OO paradigm (although it could be used in one - one small ORM has already been built on top of DataMgr). The performance overhead of DataMgr is similarly minimal. 

Aggregate Subqueries

Peter mentioned this need in the comments. DataMgr handles this as well.

The best examples are found in the aggregates page of the demonstration site. 

Plenty of other examples are available as well. Check out the DataMgr demonstration site , or download CodeCop for a working application running on DataMgr. 

Almost Done!

One of my new quarter's resolutions was to finish DataMgr 2.0. March is more than half over, so I am running out of time.

Fortunately, I have RC3 done. I didn't add any new features, but I did fix a few small bugs. This included a bug in MS Access and a bug in seeding data upon table creation.

Assuming that I get DataMgr 2.0 finished by the end of this month (and I think I will), I will get at least half of my goals met. Certainly not as good as I had hoped. Nevertheless, I really like the format of the quarterly resolutions over annual resolutions. I feel like it has kept me more focused.

The build for DataMgr 2.0 RC3 should exactly match the build for the final version. Unless, of course, more bugs are found. As the latest bugs were pretty minor, I am optimistic that they won't be.

Overall, I am really happy with the new version. I got in all of the major features that I was hoping to. Of course, I have since thought of one or two more features that I might add into a point release later. That will probably be a little ways off though as I am ready to work on other things (some of which will take advantage of DataMgr).

DataMgr Performance

I finally got around to running some performance tests on DataMgr and I must say that I am surprised by the results. I ran test of 1000 iterations on three actions: insert, get (one record) and list (getting a recordset of about 2000 records) all against MS SQL.

I used straight cfquery as my basis of comparison. I knew that cfquery would be faster, but I was surprised at how close the results were.

insert

  cfquery DataMgr
Avg: 1 7
Min: 0 0
Max: 51 120

get

  cfquery DataMgr
Avg: 1 5
Min: 0 0
Max: 40 100

list

  cfquery DataMgr
Avg: 379 377
Min: 190 200
Max: 931 591


Although the percentage difference between cfquery and DataMgr may look significant for insert and get, note that DataMgr added - on average - only between 4 and 6 milliseconds of processing time. For retrieving several records, DataMgr didn't incur any noticeable performance impact at all.

The upshot of this is that although DataMgr does add some processing overhead, it is very minimal.

I hope to eventually do more performance tests, hopefully adding in DataMgr 1.0 (or 1.2) to the comparison as well as perhaps adding in some ORM solutions for comparison as well.

DataMgr 2.0 RC2 is feature complete. Database testing did find a few small bugs, so I will have a RC3 release prior to the final 2.0 release.

DataMgr 2.0 RC2 and Performance Testing

The second release candidate of DataMgr 2.0 is ready. I said last time that DataMgr 2.0 was feature complete, but the Nashville CFUG convinced me otherwise.

The getXml() method now returns XML for all tables if no table is passed in or if a blank string is passed in as the table name. It also now returns the XML for the Relation Fields - not sure how I missed that before.

The logAction() method now records the SQL that was used to perform an action. It converts the cfqueryparam tags into values in parenthesis for easy reading. You can also call this method even if DataMgr isn't logging (though you need to have called startLogging() at least once so that the table is created).

The documentation hasn't yet been updated with these changes, but the DataMgr demonstration site is running the newest build. 

When I visited Boston and Nashville I was asked about DataMgr's performance. As a result, I plan to run some performance tests on DataMgr. I haven't decided, however, what kind of tests I should run.

So, if you have an opinion on what kind of performance testing I should do, let me know.

Maybe I could even run the same tests on other options (cfquery / objectBreeze / Reactor / Transfer ) for comparative purposes.

 

DataMgr Tour of Two Cities

I got back Friday from my two city DataMgr users group tour. I had a lot of fun presenting in both Boston and Nashville. Both groups were a lot of fun and had good questions.

While in Boston, I had a chance to look around the the city for a few hours before the presentation and it is really beautiful. Then I had time to talk to both Brian Rinaldi and Bernie Dolan a bit before the meeting, which was both fun and educational - two very smart guys.

The presentation to the Boston CFUG went well and everyone I talked to said that they enjoyed the presentation. Feel free to read Bernie's humurous tale of the event (I took the title of this blog entry from his). Afterward, I went to a local pub for a drink with Bernie, Brian, and Tom. Tom bought me a beer, which was much appreciated. Another fun and informational discussion there.

From Boston I flew to Nashville. I was too tired to look around the town, so I stayed in the hotel and napped and tried to get caught up on email and reading. The presentation to the Nashville CFUG also went well (you can read Steve "Cutter" Blades description of the presentation). This presentation was recorded, so feel free to watch the presentation (apologies for some silent spots where I am listening to questions and suggestions).

Some of the Nashville members made some good suggestions for DataMgr which will be in the next build (proving wrong my previous statement that it was feature complete).

The first suggestion was that the getXml() method should be able to return all tables if no table was passed in to the method. It turns out that DataMgr did this already. I went ahead and made it do the same if an empty string is passed in as well.

The second suggestion was that the logAction() method should record the SQL used in the action. I have added that feature and it will be in the next release. As an aside, I said in the presentation that calling this method directly when logging is not enabled would not cause DataMgr to log an entry. This is not true, but startLogging() must have been called before a logAction() method is called (or else the logging table won't exist).

Members of both groups asked about the performance of DataMgr. I was disappointed that I had not done any performance tests on DataMgr, so I had to give a conservative response that it would be best to avoid its use in very high-traffic situations. I believe that it should actually perform very well in those situations (better, I would think, than an ORM approach), but until I test that I can't really recommend its use for high-traffic applications.

That being the case, I would love any suggestions for what people think would be the best approach for load testing and performance testing for DataMgr. Ideally, I would love to see a set of tests that can be performed against DataMgr as well as cfquery, Transfer, and Reactor. I think that this could help people choose the approach that is the best fit for their environment. 

Somehow, I had time to cover "Special" functionality and "Relation Fields" at the Boston meeting, but not the Nashville one. I'm sure that means I left out something else in Nashville. Both of these features can be seen in the DataMgr Demonstration site. If you want a more complete information, the DataMgr page includes links to the documentation as well as Flash tutorials and the CFC Docs for DataMgr.

I would love to present this to any other groups that are interested (via Adobe Acrobat Connect). Just let me know if you would like me to present on DataMgr to your group).

Finally, a big thanks to Brian Rinaldi and Aaron West for allowing me to come to their groups and say a few words about my current pet project. It was a lot of fun to meet everyone in each group and I hope I get to see many of them again in the future. 

DataMgr 2.0 Release Candidate

DataMgr 2.0 has reached Release Candidate stage. It is now feature complete. The best new feature? Documentation ! I now have a full 15 pages of documentation available.

The documentation is of a first-draft quality, but it is still better than what I have had already. It should provide at least an overview of all of the major (and most of the minor) functionality of DataMgr.

Other new features include:

  • The ability to only return structure data for only one table from getTableData()
  • Allow saving of concat Relation Field value (where list length matches)
  • A new choice for insertRecord: "OnExists" argument: "save" will update for primary key match and insert in all other cases
  • Some performance improvements
  • A small bug fix on insertRecord()

If you are unfamiliar with DataMgr, it is database abstraction tool. You might think of it in the save class of tools as ORM or a SQL Generator, but it is different from each of them.

DataMgr doesn't enforce an ORM style of interaction (though it could be used as the data-access layer for one). It also doesn't generate any code. This allows more dynamic and flexible interactions than are generally provided by generated SQL.

DataMgr does provide CRUD (Create Read Update Delete) functionality. Just because you use DataMgr for some of those interactions, however, does not mean that you need to use it for all of them. I generally use DataMgr for my insert/update/delete actions, for example, but I frequenty handle my select queries outside of DataMgr (though much less frequently as of 2.0).

DataMgr also simplifies a number of other common interactions that I have not seen other similar tools address (and certainly not with the same power and simplicity). For example, it simplifies setting sort order values and handling many-to-many relationships.

The biggest advantages for DataMgr come from version 2.0, where it adds features that I haven't seen elsewhere. For example, a simulated database which can allow you to switch between a simulated database (which provides QuerySim like functionality with the randomly generated data) and a real database by merely switching which DataMgr component you use - no need to change any of the code that uses DataMgr. This is ideal for prototyping an application.

The Relation Fields feature in DataMgr allows you to treat different types of data as though they existed as field in your database without actually creating that field in your database (a "FullName" field that uses FirstName and LastName fields for example or a "Category" field that gets its data from a field in a related database).

It also has several small touches that are generally helpful. It reports which field has too much data, if one does (to avoid the unhelpful "data truncation" errors). It also replaces the special characters for quotes and dashes that MS Word uses with their ASCII equivalents.

The documentation covers these features and more in detail, but hopefully this provides a general overview of what DataMgr does. It currently runs on MS Access, MySQL, MS SQL Server, and PostGreSQL with the exact same API. It is also easy to add support for another database.

If you would like to see DataMgr in use, take a look at CodeCop which serves as both a robust code checking application and an example DataMgr application. You can also view the demonstration site to see DataMgr in use. 

DataMgr is free and open-source.

More Entries

BlogCFC was created by Raymond Camden. This blog is running version 5.8.001.