Beaten to death by lightweight markup

Over the past few years, lightweight markup languages like Markdown and Textile have started to appear everywhere, often replacing some archaic terror from the blue, like BBCode. I have, for the most part, seen this in a positive light, but recently I've started to question the sense of using these languages:

  1. They are often ambiguous,
  2. and limited compared to HTML

In this article I'll be picking on Textile, not because I think it's worse than Markdown, but because it is, in my opinion, better. I'm also only going to give one example of each problem, not because I don't have more, but because more wouldn't add to or detract from the point I'm trying to make.

Ambiguous?

Perhaps the best example of how Textile can be ambiguous is adding emphasis to a link. Here's the correct way to do it:

*"Example":http://example.com/*
OR: "*Example*":http://example.com/

And here's what I often find myself fixing:

*"Example"*:http://example.com/

Who's going to say they were wrong? It seems perfectly reasonable that this should work, except that it doesn't.

Limited?

Indeed. A lot of lightweight markup languages make a lot of sense for basic things: paragraphs, images, links and adding emphasis. You could even create a table:

|a|table|row|
|a|table|row|

That's pretty simple, and easy, but what if you need a table header? Well, you can't.

Solution?

There hasn't really been an ideal solution to the problem of providing power and ease of use at the same time. You could keep the lightweight markup features and add support for HTML, but then you're faced with another problem: you need to pay greater attention to sanitising user input, or you'll end up with people killing your layout with a well-placed </div>. Avoiding that is part of the reason why lightweight markup languages exist in the first place.

In the process of building this site I created an HTML Formatter extension for Symphony that does just that. It keeps the barest essentials that you get with a lightweight markup language, but allows you to use whatever HTML you need. You can give it basically anything as input, and it'll give you back valid XML as output.

Essentially, it's a two-step process:

  1. Run HTML Tidy over the user input,
  2. Apply those lightweight markup features

Of course, this is grossly simplified. Since we're being careful to generate valid XML output, the entire thing must be done with DOM manipulation in PHP.
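To make the two steps a little more concrete, here is a rough sketch in Python rather than PHP. It is not the extension's code: the TreeBuilder class only stands in for HTML Tidy, the names apply_emphasis and format_input are made up for illustration, and *emphasis* is the only lightweight feature shown.

```python
import re
from html.parser import HTMLParser
from xml.dom.minidom import Document

VOID = {"br", "hr", "img"}            # tags that never take a closing tag
EMPHASIS = re.compile(r"\*([^*]+)\*")  # *text* inside a single text node


class TreeBuilder(HTMLParser):
    """Step 1 (assumed stand-in for HTML Tidy): sloppy HTML in, DOM tree out."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.doc = Document()
        self.root = self.doc.createElement("div")  # wrapper for the fragment
        self.doc.appendChild(self.root)
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = self.doc.createElement(tag)
        for name, value in attrs:
            node.setAttribute(name, value or "")
        self.stack[-1].appendChild(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; a stray closing tag
        # (such as a lone </div>) matches nothing and is simply ignored.
        names = [n.tagName for n in self.stack[1:]]
        if tag in names:
            last = len(names) - 1 - names[::-1].index(tag)
            del self.stack[last + 1:]

    def handle_data(self, data):
        self.stack[-1].appendChild(self.doc.createTextNode(data))


def apply_emphasis(node):
    """Step 2: turn *text* into <em>text</em>, touching text nodes only."""
    doc = node.ownerDocument
    for child in list(node.childNodes):
        if child.nodeType != child.TEXT_NODE:
            apply_emphasis(child)
            continue
        parts = EMPHASIS.split(child.data)  # odd indexes hold the matches
        if len(parts) == 1:
            continue
        for i, part in enumerate(parts):
            if not part:
                continue
            new = doc.createTextNode(part)
            if i % 2:
                em = doc.createElement("em")
                em.appendChild(new)
                new = em
            node.insertBefore(new, child)
        node.removeChild(child)


def format_input(text):
    builder = TreeBuilder()
    builder.feed(text)
    builder.close()
    apply_emphasis(builder.root)
    return builder.root.toxml()
```

Calling format_input('Some *bold* text</div> and <b>unclosed') drops the stray </div>, closes the <b>, and wraps "bold" in <em>. Because the emphasis step never looks inside attributes, an asterisk in an href is left alone; the price is that emphasis spanning several nodes is not handled by this sketch at all.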

Anyhow, I'd really like to hear other people's thoughts on this matter. Is this a good solution? Too hard, or not enough?

Thoughts and feedback

Max Wheeler 10 July 2009

It's always hard to decide the best way to deal with user input. Beyond creating a perfect WYSIWYG editor, this seems like a good solution.

Perhaps the problem of ambiguity can be mitigated by allowing people to easily preview the output of their input. A JavaScript live preview, perhaps?

Rowan Lewis 13 July 2009

Max, the editor used on the Symphony site really does help, but only for the standard formats; if you want to do something special you really need to use HTML. If you added a live preview to that, it'd probably work pretty well, and it would also help encourage people to type valid HTML, because the preview wouldn't work if it was very broken.

I also meant to say that I don't like the idea of mixing Markdown or Textile with HTML, because to do it correctly you have to do a lot of crazy processing. Consider the following user input:


This is *emphasised text <a href="http://example.com/*">with a link</a>*

When you have something written like that, you can't just use a simple regular expression to add emphasis between the two asterisks. You don't want to stop right in the middle of the @href, because that'd cause invalid HTML to be created. So instead you have to treat the document as an XML tree from the very beginning, which makes tracking the asterisks a whole lot trickier.
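Here is a small Python sketch of that failure, using the input above. The regular expression is only an assumed example of the "simple" approach, not taken from Textile or any real formatter, and the function names are made up for illustration.

```python
import re
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# Hypothetical naive rule: anything between two asterisks becomes emphasis.
NAIVE = re.compile(r"\*(.+?)\*")

SAMPLE = 'This is *emphasised text <a href="http://example.com/*">with a link</a>*'


def naive_emphasis(text):
    """Apply the naive rule to raw markup, ignoring tags and attributes."""
    return NAIVE.sub(r"<em>\1</em>", text)


def is_well_formed(fragment):
    """Check whether the fragment parses as XML once wrapped in a <p>."""
    try:
        parseString("<p>" + fragment + "</p>")
        return True
    except ExpatError:
        return False


broken = naive_emphasis(SAMPLE)
print(broken)
print(is_well_formed(SAMPLE), is_well_formed(broken))
```

The first match ends at the asterisk inside the href, so the closing </em> lands in the middle of the attribute value and the last asterisk is left dangling. The untouched input parses fine; the "emphasised" result does not.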

It wouldn't be impossible, but it'd probably make the terrible Textile code look like a wonderful oasis.

ZalmaNN 13 January 2010

Sorry about the off-topic, but Symphony CMS is very impressive! I'm going to grab my own copy :)

Statistics

This journal entry was written on 8 July 2009, and entombed beneath Interface, Opinions and Symphony.

  • four hundred and fifty words
  • three comments
  • three samples
  • four links
