Notes SharePoint Blog

Steve Walch's blog about his favorite migration tool and other things related to Lotus Notes migration projects

Category Archives: SQL Server

Getting Started with Migrating to SQL Server Data Normalization: Images and Attachments

In yesterday’s article [link] I provided a walkthrough of migrating to a flat SQL Server table.   Now we will start digging into data normalization with related tabled.  Here we will look at using a One-To-Many table to store binary objects and in the next article we will examine Lookup tables.

Imagine that I designed a SQL Server database with a main “Documents” table and related “Parts” table for storing migrated images, attachments and embedded objects.  I wanted a separate table here because there is a one-to-many relationship between documents and images. 

The basic design for my database is shown below.  In addition to the BinaryData column (type VarBinary) to hold the raw binary data, I added DocID (int) and PartID (string) columns which together constitute a unique primary key.  Finally there is a PartName column (string) which stores the friendly display name for the attachments.   

image

Now in my Target Data Definition, I specify my Parts table as a “Normalized Table” with the relationship One-To-Many.  If you are lost at this point, go back and review the prior article “Getting Started with Migrating to SQL Server Databases”. 

image

In the Data Fields tab of my Target Data Definition, I need to define two columns for migrating content to.  (Unfortunately the “Load From SQL Server Database” button is not smart enough to do a complete job here, so you will have to do much of this manually.  If you press that button, make sure you review the following details carefully afterwards.) 

First I defined my DocId field for the Parts table and I set the “Link To Primary” property to True.  This tells Notes Migrator for SharePoint that the Primary Key of the Primary Table should always be copied to the DocId column of the Parts table.  This will happen regardless of whether the Primary Key was set from Notes data or was an auto-generated Identity column.

image

Next I defined a field of type File.  I called it “BinaryData” but I could have called it anything I wanted to.  Using the File Data Column, File Key Column, and File Name properties for my File field, I specified the three columns in my database described above for storing information about the migrated binary files.  The File Data Column property indicates where the actual binary data should go (the BinaryData column in my case).  The File Key Column property indicates where the unique attachment/image/object ID should go (PartID).  The File Name Column property indicates where the friendly display name should go (PartName).  Note that the File Name Column property is optional and the values written to the indicated column are not guaranteed to be unique.

image

Finally I set the Is Gating Field property for my File field to True.  This is very important, as it controls the normalization of records; if I map an array of 10 values to this field it will cause 10 records to be written to the Parts table. 

As shown below, I actually map all the attachments, images, and embedded objects to this field.  So if a Notes document had 6 attachments, 2 images and 3 embedded objects, we would get 11 new records in the Parts table.

image

After running my migration job, we can see that the DocID column values in the Documents table are automatically correlated with the DocID column values in the Parts table, and that it is not a one-to-one relationship.

image

One final question remains here… If all my images and attachments are stored in a separate table, how does the migrated rich text field (which is now an HTML field in the Documents table) refer to them?  I controlled this by using the “Generating Links to SQL Records” options on the Advanced tab of the migration job.  Here you can specify how links to images, attachments and even other documents should be formatted in the migrated rich text.  This feature is optional, and only makes sense if you have designed a web application to display the images and attachment files in response to a web request. 

image

Using the option shown above, the following attachment link and image reference tags were generated in my rich text document:

<a href=”http://quest.com/parts?server=REMGW0MTL1%5cSHAREPOINT&database=QDemo&table=Parts&id=2382&part=AttachmentLinks.docx”>

<img src=”http://quest.com/parts?server=REMGW0MTL1%5cSHAREPOINT&database=QDemo&table=Parts&id=2382&part=image0.GIF” border=”0″ />

Getting Started with Migrating to SQL Server Databases

Notes Migrator for SharePoint 6.0 now allows you to migrate Lotus Notes content directly to SQL Server databases.  We have seen three main cases where customers are considering doing that as opposed to migrating to SharePoint lists and libraries:

  1. They do not feel that SharePoint Lists are adequate to the needs of the job.  In this category are concerns about SharePoint scalability and volume, as well as the lack of support for transactions.
  2. They want to access the data outside of SharePoint for some reason.  Perhaps the data needs to be accessed by multiple applications (CRM systems, etc.) or the enterprise is trying to achieve a consistent architecture for modeling/storing/archiving all types of business data.
  3. They feel that they can rebuild the application design more effectively using .NET, Java or some other app dev tool.

Some people consider #1 and #3 to be issues of “application complexity”:  If an app is too complex, it should become a SQL Server / .NET solution because they assume that SharePoint can’t handle it.  While we never completely agreed with this thinking, and with SharePoint 2010 it is even less true, there are still good reasons to want to migrate to SQL Server and we are happy to start supporting it.

This article covers migrating to a simple flat SQL Server table.  Migrating to multiple tables (and normalizing the data along the way) is a more complex subject that will be covered in a subsequent article.  For now, know that this feature is capable of maintaining separate tables for binary images and attachments, normalizing multi-valued item to related tables, and managing references to lookup tables.

Migrating to SQL Server is currently only supported in the Designer client, not the Migration Console.  The first thing you need to do is start Designer, go onto the tool Options, and enable SQL Server migrations in the last tab.  You should also specify the credentials you want to use for connection to SQL Server here.

image

Once you have done this, you will now see a new drop-down selector for the target type on the target data definition tab.  This is where you indicate that a particular job is intended for SQL Server instead of SharePoint.

image

Press the Select button to specify the database you want to connect to.  (Sorry there is no “picker” here yet but there will be.)

image

Back on the SQL Server tab in your migration job, you can verify your connection using the Test button.  Remember that you can change your credentials for connecting to SQL Server in the Options dialog, as described earlier.

image

Next press the Create button to start designing a target data definition.  As with SharePoint target data definitions, this is where you define the schema for the target you are going to migrate to.  (One big difference here is that our tool does not currently allow you to provision SQL Server tables – you have to define your new schema using SQL Server Management Studio or your favorite design tool.) 

The first tab of the SQL Server data definition editor is where you specify the table you want to migrate to.  Specify the Primary Table using the drop-down picker.  (We will explore the use of additional “Normalized Tables” in a subsequent article.  For now, lets stick to just one table.) 

image

On the second tab is where you define all the fields in your target table.  You can specify them manually if you want to, but the easiest way to do this is usually to just press the “Load from SQL Server database” button.  This should populate the data definition with all the fields you might want to migrated to.

image

Each target field has a number of properties that control how Notes Migrator for SharePoint will migrate data to it.  The most important of these is the Type property, which may be set to a variety of standard SQL Server column types.  We also have types for documents that our tool knows how to generate (Word, PDF, and InfoPath documents) which may be written to SQL Server as binary files.

image

A few other are other properties worth mentioning here: “Is Html Field” controls whether or not the tool should perform the usual rich text fix-up for DocLinks, attachment references, etc.  “Is Windows Name” controls whether the tool should attempt to map Notes names to Windows names using the configured user/group mapping options.

image

Two interesting properties relate to columns that are part of the table’s primary key.  These are not critical in a simple flat table like this, but become very important when you get into data normalization (more later). 

image

Save your data definition and then proceed to the Map Data tab on your migration job.  The mapping of Notes columns to SQL Server field is the same as it is for SharePoint fields, including the ability to use formulas or constant values.  

Note that if a column is an auto-numbering Identity Column (like the DocID column shown above), you have a choice as to whether you map Notes data (such at the Notes ID) to this column or let SQL server assign the values for you.  If you want the later, simply refrain from mapping anything to this column.

image

One thing that is missing on the Map Data tab is the checkboxes for auto-mapping document metadata, etc.  Similarly on the Advanced tab there is no option for mapping permissions.  The reason is that SQL Server does not have explicit support for these types of constructs the way SharePoint does.  Of course, you may choose to add columns to you table that capture such information and build applications that use them.  In the above example, we are mapping the Created Date and Created By metadata to SharePoint columns, but from the tool’s perspective there is nothing really special about them.

When you are satisfied with your mappings, press the Run Job button and start migrating!  You can use tools such as SQL Server Management Studio or Quest’s own Toad for SQL Server to view the results.

image

One issue that comes up when migrating to SQL Server is that the server tends to be very sensitive regarding data validation.  Whereas the SharePoint APIs tend to be more forgiving and clean up data as you go, SQL Server will throw a fatal exception and an error message.  While we hope to get a little smarter about this over time, Notes Migrator for SharePoint currently gives up and simply displays the SQL Server error messages in many cases.  The result for you may be that a little trial and error needed in order to achieve a successful migration.

Note that, unlike with SharePoint, our tool does not give you the option to delete the prior contents.  You are always in “append” mode here so if you want to re-migrate you will need to clear the results of your previous job manually.  There are other things we did with SharePoint migrations that we could not do with SQL Server migrations:  detecting and merging duplicates, out-of-band metadata migration, and out-of-band security mapping. 

The good news here is that in the SQL Server world, we finally have transactions.  So if a large migration fails halfway through, we can (and do) roll back the entire job.

One final observation about the brave new world of migrating to SQL Server:  SharePoint 2010 now makes it easier than ever to bring SQL Server data back into the SharePoint user interface using External Lists, InfoPath List forms, etc., so you can think of this new feature as giving you a choice as to where the back-end data lives, even when you know you are going to build the front-end in SharePoint.

Follow

Get every new post delivered to your Inbox.