Friday, 3 December 2010

Relative URLs, the Rich Text Editor, and Reusable Content

We recently started using the Reusable Content list to supply user-editable content to one of our interactive online forms and outbound emails. The content owner can tweak the content as required because it isn't locked away in custom code or resource files: no build and deploy required, and the content management system is even used as such (whatdya know?!).

The Reusable Content list is provisioned by SharePoint when the Publishing feature is enabled; it allows users to define content snippets that can be maintained in a single location and updated automatically wherever they're used. You can optionally treat a reusable content snippet as a template—an editable copy is inserted into the page instead of a read-only, auto-updating view.

Reusable Content list items are pretty straightforward and, most importantly, contain a single HTML (rich text) field. SharePoint naturally displays its rich text editor around this field in edit mode so the edit user experience is similar to editing page content.

Unfortunately, HTML fields in SharePoint are smarter than they should be, and (in MOSS 2007) the product will mangle some content. I recently discovered that it refuses to play nicely with background-image styles: they're silently removed whether they're inline or in an embedded stylesheet. (And yes, I know, inline styles are evil, but this snippet was being plugged into an email, so everything had to be self-contained.)

Despite the tricks SharePoint plays on you with "managed" URLs, it seems the rich text field also stores URLs pointing to content within the current site as relative URLs. Absolute URLs are converted automagically, but you won't really see this until you pull the content out via the API.

We hunted for a way within SharePoint to convert all relative URLs in this content to absolute URLs, but without much luck. I think there may be a JavaScript function in one of the client scripts that does so for a chunk of content, but there's nothing obvious in the server-side API.

To address this, we replace all relative URLs using the regex below; note the named group url, which captures the relative URL itself. (Is using regex to parse HTML bad? You be the judge.) You may want to use SPUtility.GetFullUrl() to convert the individual URLs.

@"(?:<\s*(?:a|img)\s+[^>]*(?:href|src)\s*=\s*[\""'])(?!http)(?<url>[^\""'>]+)[\""'>]"
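
For anyone wondering how the pattern might be wired up, here is a minimal sketch using the .NET regex API. It is not our production code: the class name RelativeUrlFixer, the method MakeAbsolute, the baseUri parameter and the IgnoreCase option are all assumptions made for illustration. It resolves each captured url group against a base address with System.Uri so that it runs outside SharePoint.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: rewrites relative href/src values in <a> and <img> tags.
public static class RelativeUrlFixer
{
    // The pattern from the post. IgnoreCase is an assumption added here so
    // that upper-case tags and HTTP:// prefixes are handled too.
    private static readonly Regex RelativeUrl = new Regex(
        @"(?:<\s*(?:a|img)\s+[^>]*(?:href|src)\s*=\s*[\""'])(?!http)(?<url>[^\""'>]+)[\""'>]",
        RegexOptions.IgnoreCase);

    public static string MakeAbsolute(string html, Uri baseUri)
    {
        return RelativeUrl.Replace(html, delegate(Match match)
        {
            Group url = match.Groups["url"];

            // Resolve the captured value against the base address.
            // If it cannot be resolved, leave the markup untouched.
            Uri absolute;
            if (!Uri.TryCreate(baseUri, url.Value, out absolute))
            {
                return match.Value;
            }

            // Swap only the url group; keep the tag prefix and the
            // closing quote exactly as they were matched.
            int offset = url.Index - match.Index;
            return match.Value.Substring(0, offset)
                 + absolute.AbsoluteUri
                 + match.Value.Substring(offset + url.Length);
        });
    }
}
```

Called with the input <a href="/sites/forms/Pages/help.aspx">Help</a> and a base of http://intranet.example.com/ (a made-up host), this should produce an href of http://intranet.example.com/sites/forms/Pages/help.aspx, while links that already start with http are left alone. Inside SharePoint you could presumably swap the Uri.TryCreate call for SPUtility.GetFullUrl() against the current site instead.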

2 comments:

  1. So we have a rich text editor in a form; when that item gets approved, it gets sent to users in an email. If SharePoint URLs are included in that editor, they get converted to relative URLs. Does this solve that issue?
    Where do I use this regex in my form? Can it be used in an InfoPath form?

    1. @Danny: the regex needs to be used in code. I'm not sure how (or if) you can do that in InfoPath, but for our purposes it was done as the page was built, events fired, etc., using the .NET regex API.