

Announcing Markdown.NET

I've been debating this in my mind for a while now: How do I allow my readers to submit comments that contain HTML and yet maintain control over the markup? It's easy enough to miss a closing tag and skew the entire page. If you read the comments below an older post, A CMS Plugin Wanted, you'll see that this is a pretty complicated issue with no cut-and-dried answer.

Initially, I decided to go to an extreme and strip all HTML tags except anchors. It was simple and secure: nobody could inject malicious code through comments to exploit the unfortunate ones using a buggy Internet Explorer. A number of people tried to paste some ASP.NET code (which is spaghetti-looking HTML) only to notice that nothing showed up.
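A rough sketch of that kind of tag stripping, assuming a regex-based filter. The class and method names are made up for illustration, and this is not the code the site actually ran. Note that it keeps anchor tags exactly as typed, so attributes such as onclick or javascript: URLs would still need separate cleaning.

```csharp
using System.Text.RegularExpressions;

public static class CommentFilter
{
    // Matches any tag-like token except <a ...> and </a>.
    // The negative lookahead (?!/?a[\s>]) skips anchor tags. The character
    // class after it requires the token to start like a real tag, so a
    // bare "<" in text such as "1 < 2" is left alone.
    private static readonly Regex NonAnchorTag = new Regex(
        @"<(?!/?a[\s>])[/!a-z][^>]*>",
        RegexOptions.IgnoreCase);

    // Removes every matched tag; the text between tags is kept.
    public static string StripAllButAnchors(string input)
    {
        return NonAnchorTag.Replace(input, string.Empty);
    }
}
```

Only the tags disappear, not their contents, which matches the experience described above: pasted ASP.NET markup consists almost entirely of tags, so very little of it survives.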

Enter Markdown

Last week I turned to Markdown, a project run by John Gruber. In his own words:

Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).

Thus, "Markdown" is two things: (1) a plain text formatting syntax; and (2) a software tool, written in Perl, that converts the plain text formatting to HTML.

Please see Markdown basics and Markdown syntax to get a glimpse of what this is about.

John wrote it in Perl, and Michel Fortin ported it to PHP. A bunch of folks helped out along the way with ideas and testing. It's a truly impressive undertaking.

I started looking around for a .NET port, but couldn't find one, and therefore decided to port it myself. In fact, I ported the PHP implementation because I felt it was closer to C# than Perl. It was a bigger undertaking than I anticipated, partly because C# doesn't allow for some of the jiggery-pokery with regular expressions and I had to use match evaluators right and left. In retrospect, I like how C# handles those cases better because the code turned out a little more readable, IMHO.

Why Bother?

  1. I believe Markdown kills two birds with one stone: it allows you to embed HTML snips and format text outside of them at the same time.
  2. I think Markdown.NET, as I dubbed it, would be of benefit to those who run ASP.NET-based blogs, forums, and sites. The component is written in C# and compiled into a .NET assembly.

Help Test Please

John Gruber runs an online Markdown converter. Michel Fortin runs a PHP one. Following in their footsteps, I've made a Markdown.NET converter available in my Tools section.

I will appreciate any help in spreading the word about this project. I also need help testing it. I've done quite a bit of testing but I need more eyes and hands.

If you notice that a Markdown snippet is converted incorrectly, please email it to me and point out what's wrong. If you are a .NET developer and want to pinpoint the problem right in the code, the sources are available (see below).

Source Code

The latest code, ver 0.2 as of November 22, 2004, is available for download.

Comments

Comment permalink 1 Chris |
I'm glad to see an implementation of an existing mini-markup language, not a completely new one.
Comment permalink 2 David |
Nice idea. I was after something like this for my site, as I wanted users to be able to save some text in their profile (database). Would it be possible to reverse the function, i.e., convert it back to non-HTML so it could be re-edited in the browser?
Comment permalink 3 Milan Negovan |
David, I don't know of a "reverse" converter. John Gruber, the author of Markdown, would be the right person to ask.

What you can do is store Markdown text in the database and convert it only when you display it.
Comment permalink 4 Viet |
Very nice. Are there any plans to do a WYSIWYG editor for Markdown?
Comment permalink 5 Karls |
This is really nice to see. I have been struggling to find ways by which I could include standards-based methods for implementing this in my .Net applications.

I will give it a gander. Thanks and good job!
Comment permalink 6 Milan Negovan |
Viet, I wonder if any of the control designers would want to plug it into their controls. There are several .NET rich edit controls around.
Comment permalink 7 SomeNewKid |
Fantastic!

I love the concept behind Markdown, and had been planning to create my own .NET version.

My most sincere thanks for making this available, Milan.
Comment permalink 8 Karls |
How can one use this in an ASP.Net application written in VB? Sorry, I do not know much C#.
Comment permalink 9 Milan Negovan |
The source code is there in case you want to see how everything works and help pinpoint bugs (should you find any).

I included a compiled assembly (anrControls.Markdown.NET.dll). All you need to do is add a reference to it in VS.NET, and it doesn't matter if you're developing in VB.NET or any other .NET language. That's the good thing about managed code. ;)
Comment permalink 10 Nick |
Excellent job! I can't wait to get home and get this implemented. I have also been wanting a .NET version of this and had implemented some of the basics, but hadn't had time to finish it, so thanks for saving me a ton of time!!!! Keep up the great work!

Cheers,

Nick
Comment permalink 11 Nick |
Donated :).
Comment permalink 12 Milan Negovan |
Thank you for your support. ;)
Comment permalink 13 Stephen Haberman |
Re: David, Aaron Swartz wrote an HTML to Markdown tool in Python.

http://www.aaronsw.com/2002/html2text
Comment permalink 14 Pete Bevin |
Rather than converting back and forth between Markdown and HTML, I prefer to keep the markdown source as-is in the database, and just use HTML for output. That way, authors can always go back and edit the original markdown as they typed it.
Comment permalink 15 MBF |
Thank you
Comment permalink 16 Kenneth |
Excellent work on porting Markdown to C#, but I fail to see how it solves the problem of commenters invalidating your markup by missing a closing tag. If I've missed something, I'd be grateful for an explanation.
Comment permalink 17 Milan Negovan |
Kenneth, it doesn't really prevent anyone from missing a closing tag or something like that. Markdown is more of a syntax, which is supposed to make text entry easier (with a little learning curve). To enforce proper HTML formatting Tidy is a better fit.

Speaking of which, there's a Tidy port to .NET now (developed by someone else).
Comment permalink 18 Shamil |
Has anyone modified this to add br where there are line breaks? I have been looking for something to do this for ages and this is really good!! Thank you!
Comment permalink 19 Milan Negovan |
Shamil, feel free to pitch this idea to the author, John Gruber, or extend the component on your own.
Comment permalink 20 Michael Houston |
There seems to be a bug in the email address encoding: if you type, for example, < me@example.com > in the markdown test tool, and then press generate a couple of times, occasionally you just get scrambled text. It's not actually the same every time...
Comment permalink 21 Joakim Magnussen |
I've noticed that too (referring to Michael Houston's comment). It happens because the encoder randomly switches between returning hexadecimal, decimal, and raw characters. The problem is that the only difference between its hexadecimal and decimal output is an 'x', for example '&#6d;' and '&#x6d;', which are not the same. The error is on lines 1202 and 1204: the code doesn't actually convert the character to a hexadecimal value. A detailed description of the solution follows:

Replace the contents of lines 1202-1204 with the following line:
return string.Format ("&#x{0:x};", (int)c);

There you go! It couldn't be easier! All you've actually done is remove the line that inserts "&#(some-number-returning-unexpected-result);" and insert only "&#x(same-number,-but-now-it-gives-expected-result);".



And finally, a note for the author:
Users can still safely insert the dangerous "SCRIPT" HTML element.
Comment permalink 22 Joakim Magnussen |
I forgot to mention that removing that line was only a suggestion. It will no longer be "10% raw, 45% hex, 45% dec", as the original author suggested. But 10% raw and 90% hexadecimal should work fine, too.
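
For illustration, the encoding scheme discussed in these two comments might be sketched as follows. This is not the Markdown.NET source: the class and method names are hypothetical, and the thresholds are only an assumption based on the "10% raw, 45% hex, 45% dec" split quoted above. The point it shows is that a decimal reference takes the form "&#109;" and a hexadecimal one "&#x6d;", so the "x" prefix and the hex digits must always be emitted together.

```csharp
using System;
using System.Text;

public static class EmailEncoder
{
    // Decimal numeric character reference: 'm' becomes "&#109;".
    public static string ToDecimalEntity(char c)
    {
        return string.Format("&#{0};", (int)c);
    }

    // Hexadecimal numeric character reference: 'm' becomes "&#x6d;".
    public static string ToHexEntity(char c)
    {
        return string.Format("&#x{0:x};", (int)c);
    }

    // Encodes each character as raw, hex, or decimal at random,
    // in roughly a 10% / 45% / 45% split (assumed thresholds).
    public static string Encode(string address, Random rng)
    {
        StringBuilder result = new StringBuilder();
        foreach (char c in address)
        {
            double r = rng.NextDouble();
            if (r > 0.9)
                result.Append(c);                  // leave the character raw
            else if (r < 0.45)
                result.Append(ToHexEntity(c));     // e.g. "&#x6d;"
            else
                result.Append(ToDecimalEntity(c)); // e.g. "&#109;"
        }
        return result.ToString();
    }
}
```

Calling EmailEncoder.Encode("me@example.com", new Random()) gives a different string on each run, but every variant should decode back to the same address in a browser.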
Comment permalink 23 Milan Negovan |
Very cool! Thank you, Joakim, for the feedback.
Comment permalink 24 Will |
Just found another option for reversing markdown... an XSL stylesheet!

http://www.lowerelement.com/Geekery/XML/XHTML-to-Markdown.html
Comment permalink 25 Milan Negovan |
Will, good stuff!
Comment permalink 26 Ray Akkanson |
I think this is a great tool.

Ray Akkanson
