Lotus Notes to SharePoint Blog

Blog about Dell's Notes Migrator to SharePoint tool and other things related to Lotus Notes migration projects

Category Archives: Document Libraries

Filter By Rich Text Content

In SharePoint, document libraries contain files. Every “document” is a binary file that you can download to your hard disk. Notes document libraries, however, are more flexible. Some documents contain just one file attachment, while others may contain lots of rich text (and possibly multiple file attachments) in a rich text “Body” field. In fact, it is common to see mixed usage patterns in one document library, making it difficult to figure out the best migration target.

In the past, the best approach has been to migrate document libraries to a SharePoint list instead of a document library. SharePoint Lists mirror the flexibility of Notes documents well and allow you to capture your rich text (if any) in a “Body” field and migrate zero, one, or many attachments to the list’s attachment area. Unfortunately, you lose all the advantages of SharePoint document libraries with this approach. In the cases where you know that most documents are really just single file attachments, you would probably prefer to migrate those directly to a document library.

What many migration teams want to do is apply the following policy for document libraries:

  • For documents that contain just one attachment (and no other rich text), migrate the attachment directly to the SharePoint document library with all the appropriate security and metadata.
  • For documents that contain Notes rich text, generate a Word or PDF document and place it in the same SharePoint document library with all the appropriate security and metadata.
  • For documents containing neither attachments nor rich text, either skip the document or create a stub entry in the target library.

In order to implement this policy, Notes Migrator for SharePoint 6.2 now includes a new record filtering option for Notes and Domino.Doc data source definitions.

image

On the record selection tab, check the “Select documents based on Rich Text Content” checkbox. This will enable a Details button where you can specify further details. First specify one or more rich text items you would like to inspect. Second, specify the criteria you would like to use for filtering documents:

  • Whitespace only
  • One attachment only
  • Multiple attachments or other rich text

This new record selection option allows you to create multiple migration jobs for each document library, each one implementing one of the rules in the above policy. Remember that the Notes Migrator for SharePoint migration console makes it easy to sequence multiple migration jobs for one database, and to automate these jobs for many databases of the same type.
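To make the three filter criteria concrete, here is a minimal Python sketch of how documents might be sorted into one job per rule. It is an illustration only: the `NotesDoc` class, its fields, and the exact meaning given to each criterion are assumptions made for this example, not the tool's internal logic.

```python
from dataclasses import dataclass, field
from enum import Enum


class RichTextKind(Enum):
    WHITESPACE_ONLY = "Whitespace only"
    ONE_ATTACHMENT_ONLY = "One attachment only"
    MULTIPLE_OR_RICH_TEXT = "Multiple attachments or other rich text"


@dataclass
class NotesDoc:
    """Hypothetical stand-in for the rich text item of a Notes document."""
    title: str
    body_text: str = ""  # visible rich text, flattened to plain text
    attachments: list = field(default_factory=list)


def classify(doc: NotesDoc) -> RichTextKind:
    """Assumed interpretation of the three filter criteria."""
    has_text = bool(doc.body_text.strip())
    if not has_text and not doc.attachments:
        return RichTextKind.WHITESPACE_ONLY
    if not has_text and len(doc.attachments) == 1:
        return RichTextKind.ONE_ATTACHMENT_ONLY
    return RichTextKind.MULTIPLE_OR_RICH_TEXT


def plan_jobs(docs):
    """Group document titles by the migration job that would select them."""
    jobs = {kind: [] for kind in RichTextKind}
    for doc in docs:
        jobs[classify(doc)].append(doc.title)
    return jobs


docs = [
    NotesDoc("Budget", attachments=["budget.xls"]),
    NotesDoc("Meeting notes", body_text="Agenda and minutes",
             attachments=["a.doc", "b.doc"]),
    NotesDoc("Placeholder", body_text="   \n"),
]
jobs = plan_jobs(docs)
for kind, titles in jobs.items():
    print(f"{kind.value}: {titles}")
```

Each group corresponds to one migration job: direct attachment migration, Word or PDF generation, and skip-or-stub handling.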

Also note that we are planning a similar feature for extracting QuickPlace and QuickR folders, but this is not available in the current build.

Direct Folder / Document Set Migration

In previous versions of Notes Migrator for SharePoint, users could map document metadata (for example the Category property of a Notes application or the {BinderName} property for Domino.Doc documents) that would cause folders to be created in SharePoint. Folders would be created as needed as documents were being migrated. This worked for most cases, but there were limitations that customers would occasionally ask about:

  • Since we only migrate folders as a “side effect” of migrating documents, there was no way to migrate empty folders.
  • Similarly, there was no way to create the folders ahead of time (before migrating the documents)
  • There was no way to set permissions, created/modified metadata, or additional data columns on the newly created folders
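The "side effect" behavior behind these limitations can be pictured with a short Python sketch. It is a toy model, not the tool's code: the `folder_path` and `migrate_documents` helpers are hypothetical, and the set of characters treated as unsafe in folder names is an assumption that should be checked against your SharePoint version.

```python
import re

# Characters assumed to be unsafe in SharePoint folder names (an assumption
# for this sketch; verify against the limits of your SharePoint version).
UNSAFE = re.compile(r'[~"#%&*:<>?/{|}]')


def folder_path(category: str) -> str:
    """Turn a Notes category such as 'Sales\\2010' into 'Sales/2010'."""
    parts = [p.strip() for p in category.split("\\")]
    cleaned = [UNSAFE.sub("_", p) for p in parts if p]
    return "/".join(cleaned)


def migrate_documents(docs):
    """Create folders only as a side effect of placing documents."""
    library = {}  # folder path -> list of document names
    for name, category in docs:
        library.setdefault(folder_path(category), []).append(name)
    return library


docs = [
    ("Q1 report.doc", "Sales\\2010"),
    ("Pricing.xls", "Sales\\2010"),
    ("Policy.pdf", "HR: Policies"),
]
library = migrate_documents(docs)
print(library)
```

Because a folder only appears when some document maps to it, a category with no documents never produces a folder, and the folder itself carries no permissions or metadata of its own.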

Now in Notes Migrator for SharePoint 6.2 we offer a way to do direct folder migration. This is really two separate features that work together…

Migrating records to folders

image  image

On the Advanced tab of your Target Data Definition, you can now indicate that you want to migrate to a folder in your target list or library.  In this mode of operation, every record you extract from the data source will result in a folder being created, instead of a document!  The only additional requirement is that you map at least one item to a target column of type Folder (which controls the new folder names). Many of the usual document migration features will now apply to folders including:

  • Mapping of permissions (using the “Map Reader/Author fields” checkbox on your Advanced tab)
  • Mapping created/modified metadata to folders (using the “Preserve Created/Modified” checkboxes on the Map Data tab)
  • Mapping additional data items to folders (requires creating a new Folder content type). 

Note that many job features that would apply to document migration will not apply to folder migrations. For example, document generation and duplicate document handling options are disallowed in this context.

Extracting information from Domino.Doc Binders

image image

One of the things that customers clearly want to do with this new feature is to migrate all the information from their Domino.Doc Binders to SharePoint folders. To support this, we have added a new option to do exactly that in Domino.Doc Source Data Definitions. Simply select the Binders radio button on the Document Selection job, and now you are extracting Binders instead of Documents. Each row in the Preview represents a Binder in the current file cabinet, and we have included additional columns for all of the standard Binder metadata available in Domino.Doc. Of course, you can add additional columns to this query as well.

So putting these features together, you would typically map the {Title} property of your data source to a Folder column in your target. Simply checking “Map Reader/Author fields”, “Preserve Created/Modified identities”, and “Preserve Created/Modified dates” should bring over most of the other metadata but you can certainly add additional mappings if desired.

Note that this feature will only write new SharePoint folders; it will not update existing ones with the same name. So a best practice is to run the Binder migration job first (to create the folders with all the properties intact) and then run your normal document migration job.
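The reason the job order matters can be shown with a small Python simulation. This is a sketch under stated assumptions: the dictionary-based "library" and the `run_binder_job` and `run_document_job` functions are hypothetical stand-ins, and only the create-but-never-update rule is taken from the description above.

```python
def run_binder_job(library, binders):
    """Create one folder per binder; existing folders are left untouched."""
    for binder in binders:
        title = binder["Title"]
        if title in library:
            continue  # mirrors "it will not update existing ones"
        library[title] = {"created_by": binder["CreatedBy"], "docs": []}


def run_document_job(library, docs):
    """Place documents, creating bare folders on demand."""
    for name, binder_title in docs:
        folder = library.setdefault(
            binder_title, {"created_by": None, "docs": []}
        )
        folder["docs"].append(name)


binders = [
    {"Title": "Contracts", "CreatedBy": "Alice"},
    {"Title": "Empty binder", "CreatedBy": "Bob"},
]
docs = [("nda.doc", "Contracts")]

# Recommended order: binders first, then documents.
good = {}
run_binder_job(good, binders)
run_document_job(good, docs)

# Reversed order: the document job creates a bare "Contracts" folder,
# and the binder job then skips it.
bad = {}
run_document_job(bad, docs)
run_binder_job(bad, binders)

print("binders first:", good["Contracts"]["created_by"])
print("documents first:", bad["Contracts"]["created_by"])
```

In the reversed run the "Contracts" folder keeps its bare, side-effect metadata, which is the situation the best practice avoids.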

Also note that we are planning a similar feature for extracting QuickPlace and QuickR folders, but this is not available in the current release.

Migrating to Document Sets

Similar to migrating to folders, Notes Migrator for SharePoint 6.2 also gives you the ability to migrate directly to document sets. The situation here is very similar to what was described above. The tool already allowed creation of document sets as files within those document sets were being migrated. This is a powerful and popular feature, but it suffered from some of the same limitations:

  • Since we only create document sets as a “side effect” of migrating documents, there was no way to create empty document sets.
  • Similarly, there was no way to create the document sets ahead of time (before migrating the documents)
  • There was no way to set permissions or created/modified metadata on the newly created document sets separately from the documents.

The solution is similar to the folder solution described above. On the Advanced tab of your Target Data Definition, you can now indicate that you want to migrate to a document set in your target list or library.  In this mode of operation, every record you extract from the data source will result in a new empty document set being created, instead of a document!  The only additional requirement is that you add a target column of type DocumentSet and map a value to the DocumentSet.Name property. All of the other features of document set migration (described here) still apply. The difference is that every record you select gets mapped to a document set instead of a file within a document set.

Migration basics: Migrating to Document Libraries

From the perspective of a migration specialist, a library is very similar to a list. The main difference is that in a library, each “document” is an actual binary file with various data properties associated with it.

Therefore, migrating a Notes database to a SharePoint document library could be as simple as extracting binary file attachments out of each Notes document and placing them in the library. This simplistic approach makes sense if the Notes application itself was designed to manage binary files, that is, if each Notes document is really just a wrapper around a binary file attachment. Domino.Doc is an example of this type of application. In the screen shot below, Notes Migrator for SharePoint was used to extract the attachments from each Notes document and place them in a document library. You can also extract various metadata items about each document and map them to SharePoint properties.

image

Be aware that several things can go wrong with this type of migration job. If there are no attachments in a particular Notes document (i.e., if it is just a normal rich text document), then nothing will be migrated to the library.  If there are multiple attachments in a particular Notes document, they may all be migrated to the library but they will no longer be one self-contained document.  In either case, you have probably misinterpreted the way the Notes application was used, and it should not have been migrated to a library in this manner. 

There are two answers to this dilemma. The first possibility is to migrate the application to a list instead of a document library. Even if it was called a “document library” in Notes, it may be more appropriate to map it to a custom list in SharePoint.

The second possibility is to generate new documents (one for every Notes document) and check those into the document library. Notes Migrator for SharePoint can convert Notes documents to a variety of file formats, including HTML, MIME, Word, PDF, and InfoPath. These are the binary files that users will open when they click on “the document”. The three most popular choices for formatting Notes documents as binary files are discussed below.

Microsoft Word

Microsoft Word is a popular choice in environments that have standardized on Microsoft Office for document creation. The integration between Microsoft Office and SharePoint libraries is very good and can enable you to build a variety of powerful applications. Users can open documents from libraries, edit them and seamlessly save them back again. If a version control, check-in/check-out, or approval process or workflow has been enabled for the library, it will all work automatically. Office clients even support single sign-on with SharePoint and SharePoint Online Dedicated.

Office documents in SharePoint libraries are easy to search and you can even generate an instant SharePoint workspace to enable teams to collaborate on a particular document. Multiple users can open the same Word document and edit it at the same time, and users can see the changes being made by other users almost instantly.

When migrating Notes documents to the Microsoft Word format, you can migrate to simple unadorned Word documents or to custom Word templates. You can also migrate Notes data items to Word document properties or even to content controls in your custom template.

The following screen shot shows a Notes rich text document that was converted to a Word document on a custom letterhead template and checked into a SharePoint document library:

image

For more details on migrating to Word documents, see these posts.

PDF

PDF is another popular choice for migrating rich text Notes documents. Many organizations, especially in Europe, use PDFs to archive old content. Since PDF is now an open standard, the assumption is that there will always be a PDF reader such as Adobe Acrobat available in the future. An organization that has a large number of Notes databases with rich text documents may find that PDF is an ideal target format for many of them.

When PDF documents are placed in a SharePoint library, the integration is not quite as tight as it is with Office applications, but the user experience is still reasonable. Even though PDF readers and editors are not generally “SharePoint aware,” the experience of opening PDF documents is similar to downloading them from any web site. PDF documents can work with SharePoint’s search features, but you need to install a free add-on from Adobe for SharePoint’s full-text search indexer to read the content.

A word of warning about migration tools: a number of tools advertise the ability to convert Notes documents to PDF documents, but deliver poor results. If you plan to use this feature, we strongly recommend that you test the tools with your users’ most complex documents. Watch how nested tables, embedded images, links to attachments, and doc links are handled.

The following figure shows a Notes rich text document that was converted to a PDF document and checked into a SharePoint document library:

image

PDF migrations are discussed in more detail here.

Document rendering

In some cases, simply converting the rich text bodies of your Notes documents to Word or PDF files is not good enough because it does not capture the rich form layout that Notes users are used to. Without the form layout, you really haven’t captured all the information.

This is where an advanced concept known as document rendering comes in. With this technique, the migration tool “renders” each document with its original Notes form to generate a new rich text document that includes the entire form layout you had in Notes. To visualize this, compare the generated PDF document in the following figure with the PDF example provided above:

image

In addition to the rich text body, we captured the information that was presented with the original Notes form. Most significantly, we did not have to redevelop the form in SharePoint to accomplish this, nor did we have to explicitly map all the data fields that are displayed in the form header.

We used PDF in this example, but the same concept of rendering documents with forms applies equally well to Word documents, InfoPath documents, pages and even lists.

The Render With Form feature is discussed more fully here.

InfoPath

People sometimes choose the InfoPath document format when they want to migrate complex Notes applications to SharePoint, usually for one of two reasons: either the applications have complex data structures that do not lend themselves to being stored in a SharePoint List, or the applications have complex form designs that contain dynamically hidden sections, input validation rules, buttons, form events, or other sophisticated form logic. Ways of addressing the second issue are discussed in the section “Migrating Application Designs” below, so for now, we will focus on the migration of Notes content to InfoPath documents.

InfoPath data documents (traditionally called “InfoPath forms”) are really XML files that you edit using the InfoPath client (part of Office) or perhaps in a browser if your SharePoint server is running InfoPath Forms Services. You specify the layout and behavior of your InfoPath form (and associate it with your desired XML schema) by creating an InfoPath form template. Typically you would store your XML data documents in a special type of SharePoint library known as a form library. This library is associated with one or more form templates in such a way that end users get a fairly seamless experience of creating, viewing, and editing complex documents right from the library.

When performing a Notes migration, your main job is to convert Notes documents to InfoPath XML data documents according to a particular XML schema and check them into a form library.  Notes Migrator for SharePoint makes this fairly straightforward.  All you need to do is load in an InfoPath form template and specify how you want various Notes data elements to map to the various parts of your new XML schema.  These elements might include not only simple data fields, but also rich text, embedded images and attachments, links to external attachments, and links to other documents.  As XML schemas are not necessarily one dimensional, you may need to map one-to-many data items as well (for example, a product description document might contain multiple distributor names).
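As a rough illustration of a one-to-many mapping, the Python sketch below turns a Notes-like record into a small XML document with a repeating element. The element names, the field names, and the `to_xml` helper are all hypothetical, and real InfoPath data documents also carry template-specific namespaces and processing instructions that are omitted here.

```python
import xml.etree.ElementTree as ET


def to_xml(notes_doc):
    """Map a Notes-like record to a simple, made-up product schema."""
    root = ET.Element("product")
    ET.SubElement(root, "name").text = notes_doc["ProductName"]
    ET.SubElement(root, "description").text = notes_doc["Body"]

    # One-to-many: a multi-valued Notes item becomes a repeating element.
    distributors = ET.SubElement(root, "distributors")
    for distributor in notes_doc.get("Distributors", []):
        ET.SubElement(distributors, "distributor").text = distributor
    return root


doc = {
    "ProductName": "Widget",
    "Body": "A general-purpose widget.",
    "Distributors": ["Acme Supply", "Northwind Traders"],
}
root = to_xml(doc)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The multi-valued `Distributors` item is the kind of data that does not fit a single flat column, which is why it is mapped to a repeating group in the schema.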

The following screen shot shows a Notes rich text document that was converted to an InfoPath document (associated with a specific InfoPath form template) and checked into a SharePoint form library:

image

NOTE:  We did not specify how the InfoPath form template was created. It may have been created from scratch by your InfoPath developer or you may have migrated an existing Notes form using Notes Migrator for SharePoint.

For more details on InfoPath migrations, look here.

Details, details, details

Additional considerations for migrating to SharePoint libraries include:

  • Notes documents contain metadata such as Created By, Created Date, Last Modified By, and Last Modified Date. Many migration tools drop this metadata during migrations to SharePoint, resulting in a major loss of business data.
  • Most Notes databases contain access control lists, which determine what specific users can do in a particular application. In addition, individual documents contain access restrictions such as Readers lists and Authors lists. Access definitions may use groups in the Domino Directory as well as roles defined for the database. Preserving all this information correctly in SharePoint may be critical to a successful migration of sensitive data.
  • If your Notes application has a concept of document versioning, make sure your migration tool allows you to correctly map the versioning to SharePoint versioning. All versions of a given document in Notes should appear to be versions of the same document in SharePoint.
  • It is common to want to assign folders during a migration. You can dynamically generate folder names in SharePoint based on data extracted from Notes.
  • Instead of folders, you may want to use a powerful new SharePoint feature called Document Sets. This feature is discussed in more detail here.
  • When generating documents in a document library, you may wish to put images and attachments in a subfolder or even in a different library.  This technique is discussed here.
  • For complex applications with mixed content, you may wish to assign Content Types to documents as you migrate them.
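
The versioning point above can be sketched in a few lines of Python. This is only an illustration: the `doc_key` and `version` fields are hypothetical names for whatever your Notes application uses to tie versions together, and the sketch shows the grouping step only, not how versions are written to SharePoint.

```python
from collections import defaultdict


def build_version_histories(records):
    """Group exported records by document key, oldest version first."""
    histories = defaultdict(list)
    for record in records:
        histories[record["doc_key"]].append(record)
    return {
        key: sorted(versions, key=lambda r: r["version"])
        for key, versions in histories.items()
    }


records = [
    {"doc_key": "SPEC-7", "version": 2, "file": "spec_v2.doc"},
    {"doc_key": "SPEC-7", "version": 1, "file": "spec_v1.doc"},
    {"doc_key": "PLAN-3", "version": 1, "file": "plan.doc"},
]
histories = build_version_histories(records)
for key, versions in histories.items():
    print(key, [v["file"] for v in versions])
```

Each key in the result would become one SharePoint document, with its records checked in as successive versions in order.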

Lists, Libraries and Pages

There are three basic ways to store content in SharePoint: lists, libraries, and pages. Each of these has a number of interesting variations, but it is important to understand the differences between these three fundamental types so you can best decide what you want to migrate to. Each type is described briefly here; subsequent posts will explain in detail how to migrate content to each from Lotus Notes applications.

Lists

Lists are similar to tables in a relational database. A list is a flat collection of data records (called items in SharePoint) with a fixed set of data fields (called columns). Each data column has a fixed name and type. For example, a customer list may have a Text column called “Customer Name,” a Date column called “First Purchase Date,” and potentially dozens of other columns. One particularly interesting column type is Rich Text (also known as a “Note” or “Body” column); this is where one would typically store large amounts of rich text. List items can also have one or more binary attachments, and a list may have one or more views, which allow users to select and sort the items in various ways.

All of this should sound pretty familiar to Notes customers, because a list is actually the closest thing in SharePoint to a Notes database. The biggest difference is that SharePoint lists are highly structured with a fixed schema (like a relational database), whereas Notes databases can be very unstructured, with every document having a different set of data items.

Libraries

Libraries are collections of binary files, such as images, Word documents, or audio clips. While lists and libraries are very similar internally, the metaphor is very different: in a list, the document may contain several binary file attachments; in a library, the binary file is the document. The emphasis in libraries is the document management functionality, including versioning and check-in/check-out. As with lists, libraries can have many additional data columns defined for capturing additional information about each document.

In the Notes world, the closest thing to a SharePoint library is a Domino.Doc file cabinet. (Domino.Doc was a popular document management system built on top of Notes.) Many organizations also built custom Notes applications that attempt to implement document management functionality. Any time you see a Notes application where the file attachment is “the document,” consider migrating it to a SharePoint library. It is also common for Notes “team site” applications to have a document library section as part of the overall application.

Pages

Pages are the building blocks of all SharePoint sites. These are the web pages you actually see in the web browser every time you click on a link to view a site, open a document, enter some information, or do just about anything else. Most people do not realize that the same pages that make up the sites themselves can also be used as data documents. SharePoint actually allows you to create several types of content pages, including basic pages, wiki pages, web part pages, and publishing pages. SharePoint Online Dedicated includes several nice publishing site templates that are designed to manage the authoring, approval, and display of rich text web pages.

While content pages have no exact equivalent in the Notes world, they can be a great way to migrate certain types of Notes applications. Any time you see a Notes application in which the main intent was to publish a library of rich text pages to a large number of users, consider migrating it to a SharePoint page library or a publishing site. This includes the many Notes applications that implemented public web or extranet sites.