I've been debating this in my mind for a while now: How do I allow my readers to submit comments that contain HTML and yet maintain control over the markup? It's easy enough to miss a closing tag and skew the entire page. If you read the comments below an older post, A CMS Plugin Wanted, you'll see that this is a pretty complicated issue which has no cut-and-dried answer.
Initially, I decided to go to an extreme and strip all HTML tags except anchors. It was simple and secure: nobody could inject malicious code through comments to exploit the unfortunate ones using a buggy Internet Explorer. The downside showed up quickly: a number of people tried to paste some ASP.NET code (which is spaghetti-looking HTML) only to notice that nothing showed up.
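To make the "strip everything except anchors" idea concrete, here is a minimal sketch in Python. It is not the code this blog ran; the function name and the regular expression are made up for illustration, and a regex like this is nowhere near a complete sanitizer (it leaves anchor attributes untouched, for one).

```python
import re

# Hypothetical pattern: any opening or closing tag whose name is not exactly "a".
# The negative lookahead (?!a\b) skips <a ...> and </a> but still catches <abbr>.
TAG_EXCEPT_ANCHOR = re.compile(r"</?(?!a\b)[a-z][a-z0-9]*\b[^>]*>", re.IGNORECASE)

def strip_tags_except_anchors(text):
    """Drop every tag except anchors; the text between tags is left alone."""
    return TAG_EXCEPT_ANCHOR.sub("", text)

comment = '<b>Hi</b> <a href="http://example.com">link</a><script>x</script>'
print(strip_tags_except_anchors(comment))
```

It also shows why pasted ASP.NET markup vanished: to a filter like this, every server tag is just another tag to drop.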
Enter Markdown
Last week I turned to Markdown, a project run by John Gruber. In his own words:
Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).
Thus, "Markdown" is two things: (1) a plain text formatting syntax; and (2) a software tool, written in Perl, that converts the plain text formatting to HTML.
Please see Markdown basics and Markdown syntax to get a glimpse of what this is about.
John wrote it in Perl, and Michel Fortin ported it to PHP. A bunch of folks helped out along the way with ideas and testing. It's a truly impressive undertaking.
I started looking around for a .NET port but couldn't find one, so I decided to port it myself. In fact, I ported the PHP implementation because I felt it was closer to C# than Perl is. It was a bigger undertaking than I anticipated, partly because C# doesn't allow for some of the jiggery-pokery with regular expressions, and I had to use match evaluators left and right. In retrospect, I like how C# handles those cases better because the code turned out a little more readable, IMHO.
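For readers who haven't met match evaluators: in .NET, Regex.Replace can take a MatchEvaluator delegate, a function that receives each match and returns its replacement text. The sketch below shows the same pattern in Python, where re.sub accepts a function in place of a replacement string. It is an illustration only, not code from Markdown.NET; the pattern and the function names are assumptions, and the pattern handles only simple single-backtick code spans.

```python
import html
import re

# Hypothetical, simplified pattern for a `code span`: text between single backticks.
CODE_SPAN = re.compile(r"`([^`]+)`")

def render_code_spans(text):
    """Wrap backtick spans in <code> and escape the HTML inside them."""
    def to_code(match):
        # This callback plays the role of a match evaluator: it computes the
        # replacement from the match instead of using a fixed template string.
        return "<code>" + html.escape(match.group(1), quote=False) + "</code>"
    return CODE_SPAN.sub(to_code, text)

print(render_code_spans("Use `a < b` here"))
```

The replacement has to be computed per match because the span's content must be escaped, which a static replacement string cannot do. Keeping that logic in a small named function is the readability gain mentioned above.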
Why Bother?
- I believe Markdown kills two birds with one stone: it allows you to embed HTML snippets and format the text outside of them at the same time.
- I think Markdown.NET, as I dubbed it, would be of benefit to those who run ASP.NET-based blogs, forums, and sites. The component is written in C# and compiled into a .NET assembly.
Help Test Please
John Gruber runs an online Markdown converter. Michel Fortin runs a PHP one. To follow in their footsteps, I've made a Markdown.NET converter available in my Tools section.
I would appreciate any help in spreading the word about this project. I also need help testing it. I've done quite a bit of testing, but I need more eyes and hands.
If you notice that a Markdown snippet is converted incorrectly, please email it to me and point out what's wrong. If you are a .NET developer and want to pinpoint the problem right in the code, the sources are available (see below).
Source Code
The latest code, version 0.2 as of November 22, 2004, is available for download.