Translator framework for WYSIWYG editors
This plugin supports the integration of WYSIWYG (What-You-See-Is-What-You-Get) editors. On its own, the only thing this plugin gives you is a stand-alone HTML to TWiki translator script. For WYSIWYG editing in TWiki, you will also need to install a specific editor package such as TWiki:Plugins.KupuEditorContrib.
This plugin provides a generic framework that supports editing of TWiki topics using any browser-based HTML editor. It works by transforming TML (TWiki Meta Language) into HTML for the editor, and then transforming HTML back into TML on save.
- Supports the input of malformed HTML
- Full round-trip (TML -> XHTML -> TWiki syntax)
- Framework is editor-agnostic
What's in the package
The package includes the following pieces:
- TML (TWiki syntax) to HTML translator
- HTML to TML translator (with stand-alone script)
- Generic TWiki plugin for automating the translation during editing
How it works
The plugin works by translating the topic text into HTML when someone edits a topic. The HTML is then fed to the WYSIWYG editor. On save, the edited HTML is run through the reverse translation before saving to the topic. TWiki syntax is used in preference to HTML in the stored topic wherever possible, though HTML may be used if the translator can't find a suitable TML equivalent.
The default rendering that TWiki uses to generate HTML for display in browsers is 'lossy' - information in the TWiki syntax is lost in the HTML output, and a round-trip (recovering the original TWiki syntax from the HTML) is impossible. To solve this problem the plugin instead uses its own translation of TWiki syntax to XHTML. The generated XHTML is annotated with CSS classes that support the accurate recovery of the original TWiki syntax.
Before you ask the obvious question, yes, the translator could be used to replace the TWiki rendering pipeline for generating HTML pages. In fact, the translator is taken almost directly from the implementation of the rendering pipeline for the TWiki-4 release.
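The annotation idea can be illustrated with a toy round trip. This is a minimal sketch, not the plugin's translator: the class name `tml-protected`, the variable pattern, and both function names are assumptions made up for the illustration. It wraps each TWiki variable in a span carrying a CSS class, so the reverse step can recover the original text exactly.

```python
import html
import re

# Hypothetical pattern for a TWiki variable, with or without parameters.
VARIABLE = re.compile(r"%[A-Z]+(?:\{[^}]*\})?%")
# Hypothetical CSS class; the real plugin's class names are not shown here.
SPAN = re.compile(r'<span class="tml-protected">(.*?)</span>', re.S)


def tml_to_annotated_html(tml):
    """Escape the text and wrap each variable in an annotated span."""
    out, pos = [], 0
    for match in VARIABLE.finditer(tml):
        out.append(html.escape(tml[pos:match.start()]))
        out.append('<span class="tml-protected">%s</span>'
                   % html.escape(match.group(0)))
        pos = match.end()
    out.append(html.escape(tml[pos:]))
    return "".join(out)


def annotated_html_to_tml(xhtml):
    """Unwrap the annotated spans, then undo the escaping."""
    return html.unescape(SPAN.sub(lambda m: m.group(1), xhtml))
```

Because the variable is carried through verbatim inside the span, nothing is lost, unlike a display rendering that would expand the variable into its output.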
Translation of the HTML back to TWiki syntax uses CPAN:HTML::Parser. This parser is used in preference to a more modern XML parser, because the WYSIWYG editor may not generate fully compliant XHTML. A strict parser would risk losing content, whereas CPAN:HTML::Parser is better at handling malformed HTML.
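The difference between a lenient HTML parser and a strict XML parser can be seen in a small experiment. The sketch below uses Python's standard `html.parser` and `xml.etree.ElementTree` modules as stand-ins; it is an analogy only and says nothing about how HTML::Parser is implemented. The helper names are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextCollector(HTMLParser):
    """Lenient parser that simply gathers the text content it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(markup):
    """Return the text content, tolerating unclosed or mismatched tags."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def strict_parses(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Unclosed <br> and <b>: typical of what a browser editor may emit.
malformed = "<p>first<br>second <b>bold</p>"
```

Here `strict_parses(malformed)` fails, so a strict pipeline would have nothing to save, while `lenient_text(malformed)` still recovers all of the text.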
There is also the advantage that the translator can be used to import HTML from other sources, for example existing web pages. Due to the simple nature of TWiki syntax and the potential complexity of web pages, this translation is often lossy, i.e. there will be HTML features that can be entered by editors that will be lost in this translation step. This is especially noticeable with HTML tables.
Using the translators from Perl scripts
Both translators can be used directly from Perl scripts, for example to build your own stand-alone translators.
A stand-alone convertor script for HTML to TWiki is included in the installation. It can be found in
Integrating an HTML Editor
The plugin can be used to integrate an HTML editor in a number of different ways.
- The HTML for the content-to-be-edited can be generated directly in the standard edit template.
- The HTML for the content-to-be-edited can be generated directly in a specialised edit template.
- A URL can be used to fetch the content-to-be-edited from the server, for use in an IFRAME.
- REST handlers can be called from JavaScript to convert content.
Generating content directly in the standard edit template
This is the technique used by WYSIWYG editors that can sit on top of HTML
textareas, such as TinyMCE. The topic content is pre-converted to HTML before inclusion in the standard edit template. These editors use plugins that have a
. These handlers are responsible for the conversion of topic text to HTML, and post-conversion of HTML back to TML.
- User hits "edit".
- Editor-specific plugin
beforeEditHandler converts topic content to HTML by calling
- User edits and saves
- Editor-specific plugin
afterEditHandler converts HTML back to TML by calling
- WysiwygPlugin should not be enabled in
WYSIWYGPLUGIN_WYSIWYGSKIN should not be set.
- Your plugin should set the
textareas_hijacked context id, to signal to skins to suppress their textarea manipulation functions.
This is the recommended integration technique, if your editor can support it.
Generating content directly in a specialised edit template
This technique is useful when the editor requires the topic content in a variety of different formats at the same time. In this scenario the editor uses a custom edit template. The WYSIWYG content is made available for instantiation in that template in a number of different formats.
be set for this to work.
The flow of control is as follows:
- User hits "edit" with the skin (or cover) set the same as
- The WysiwygPlugin
beforeEditHandler determines if the topic is WYSIWYG editable, and vetoes the edit if not by redirecting to the standard edit skin.
- The edit template containing the JS editor is instantiated.
- The following variables are available for expansion in the template:
%WYSIWYG_TEXT% expands to the HTML of the content-to-be-edited. This is suitable for use in a
- User edits and saves
afterEditHandler in the WysiwygPlugin sees that
wysiwyg_edit is set, which triggers the conversion back to TML.
- The HTML form in the edit template must include an
wysiwyg_edit and set it to 1, to trigger the conversion from HTML back to TML.
WYSIWYGPLUGIN_WYSIWYGSKIN must be set to the name of the skin used for WYSIWYG editing. This is usually the name of the editor e.g.
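As a rough illustration of what the submitted form data might look like, the sketch below builds a URL-encoded body that carries the edited HTML together with `wysiwyg_edit=1`. Only the `wysiwyg_edit` parameter comes from the description above; the `text` field name and the helper name are assumptions for the example.

```python
from urllib.parse import urlencode


def build_save_body(edited_html):
    """Encode the edited HTML plus the flag that requests HTML-to-TML conversion."""
    return urlencode({
        "text": edited_html,   # assumed name of the field holding the content
        "wysiwyg_edit": "1",   # triggers the conversion back to TML
    })
```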
Fetching content from a URL
In this scenario, the edit template is generated without
the content-to-be-edited. The content is retrieved from the server using a URL e.g. from an
The flow of control is as follows:
- The first three steps are the same as in Generating content directly in a specialised edit template
- When the document loads in the browser, the JS editor invokes a content URL (using an
IFRAME or a
XmlHttpRequest) to obtain the HTML document to be edited
- The content URL is just a TWiki
view URL with the
wysiwyg_edit parameter set.
- The WysiwygPlugin recognises the
wysiwyg_edit parameter and uses the TML2HTML translator to prepare the text, which is then returned as
text/plain to the browser.
- Two TWiki variables,
%OWEB% and %OTOPIC%, can be used in the content URL in the edit template to refer to the source topic for the content.
- After edit handling is as for Generating content directly in a specialised edit template
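A content URL of the kind described in the steps above could be assembled as follows. This is a sketch under assumptions: the base view URL is a made-up example, the helper name is hypothetical, and the value `1` for `wysiwyg_edit` is assumed (the text only says the parameter must be set).

```python
from urllib.parse import urlencode


def content_url(view_base, web, topic):
    """Build a view URL for web/topic with the wysiwyg_edit parameter set."""
    query = urlencode({"wysiwyg_edit": "1"})  # value "1" is an assumption
    return "%s/%s/%s?%s" % (view_base.rstrip("/"), web, topic, query)


# Hypothetical installation URL, for illustration only.
url = content_url("http://example.com/twiki/bin/view/", "Main", "WebHome")
```

In an edit template the web and topic would come from `%OWEB%` and `%OTOPIC%` rather than being passed in by hand.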
Editors can use
to perform saves, by POSTing to the TWiki
script with the
parameter set to
. This parameter tells the
in the WysiwygPlugin to convert the content back to TML. See TWikiScripts
for details of the other parameters to the
Once the save script has completed it responds with a redirect, either to an Oops page if the save failed, or to the appropriate post-save URL (usually a
). The editor must be ready to handle this redirect.
Attachment uploads can be handled by URL requests from the editor template to the TWiki
script normally redirects to the containing topic; a behaviour that you usually don't want in an editor! There are two ways to handle this:
- If the uploads are done in an
IFRAME or via
XmlHttpRequest, then the 302 redirect at the end of the upload can simply be ignored.
- You can pass
noredirect to the
upload script to suppress the redirect. In this case you will get a
text/plain response of
OK followed by a message if everything went well, or an error message if it did not.
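An editor that passes `noredirect` has to interpret the plain-text reply itself. A minimal sketch, assuming the success reply starts with the word `OK` and anything else is an error message; the exact separator after `OK` is not specified above, so the code strips spaces, colons and newlines. The function name is hypothetical.

```python
import re


def parse_upload_response(body):
    """Return (succeeded, message) for a text/plain upload reply."""
    text = body.strip()
    if re.match(r"OK\b", text):
        # Drop the leading "OK" and any separator that follows it (assumed format).
        return True, text[2:].lstrip(" :\t\r\n")
    return False, text
```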
to convert content from TML to HTML and back again.
The plugin defines the following REST handlers:
Converts the HTML text to TML.
Converts the TML text to HTML.
be specified. The response is a
page of converted content.
Plugin Installation Instructions
You do not need to install anything in the browser to use this extension. The following instructions are for the administrator who installs the extension on the server where TWiki is running.
Like many other TWiki extensions, this module is shipped with a fully
automatic installer script written using the BuildContrib.
- If you have TWiki 4.2 or later, you can install from the
configure interface (Go to Plugins->Find More Extensions)
- If you have any problems, then you can still install manually from the command-line:
- Download one of the
- Unpack the archive in the root directory of your TWiki installation.
- Run the installer script (
perl <module>_installer )
configure and enable the module, if it is a plugin.
- Repeat for any missing dependencies.
- If you are still having problems, then instead of running the installer script:
- Make sure that the file permissions allow the webserver user to access all files.
- Check in any installed files that have existing
,v files in your existing install (take care not to lock the files when you check in)
- Manually edit LocalSite.cfg to set any configuration variables.
Plugin Configuration Settings
can be set to make the plugin sensitive to what is in a topic, before allowing it to be edited. You can set it up to veto an edit if the topic contains:
html - HTML tags (e.g.
<div>, not including <br>), or
variables - simple variables (e.g.
calls - TWiki variables with parameters e.g.
pre blocks (
If the plugin detects an excluded construct in the topic, it will refuse to allow the edit and will redirect to the default editor.
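The veto logic can be pictured as a set of pattern checks run against the topic text. The sketch below is a simplification under assumptions: the regular expressions are rough approximations written for this example, the comma-separated setting format is assumed, and the function name is hypothetical. The plugin's own detection rules may differ.

```python
import re

# Approximate detectors for the excludable constructs (illustrative only).
CHECKS = {
    "html": re.compile(r"<(?!/?br\b)/?[a-zA-Z][^>]*>"),        # any tag except <br>
    "variables": re.compile(r"%[A-Z][A-Z0-9_]*%"),             # simple variables
    "calls": re.compile(r"%[A-Z][A-Z0-9_]*\{[^}]*\}%"),        # variables with parameters
    "pre": re.compile(r"<pre\b", re.I),                        # pre blocks
}


def vetoed_by(topic_text, exclude):
    """Return the first excluded construct found in the topic, or None."""
    for name in (part.strip() for part in exclude.split(",")):
        pattern = CHECKS.get(name)
        if pattern and pattern.search(topic_text):
            return name
    return None
```

A non-None result corresponds to the case where the edit is refused and the user is redirected to the default editor.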
If you excluded
, you can still define a subset of TWiki variables that do not
block edits. This is done in the global
, which should be a list of TWiki variable names separated by vertical bars, with no spaces, e.g:
* Set WYSIWYG_EDITABLE_CALLS = COMMENT|CALENDAR|INCLUDE
You should set
, or in WebPreferences
for each web.
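Because the list is separated by vertical bars with no spaces, it can be dropped straight into a regular expression as an alternation. The following sketch shows that idea; the call pattern and the function name are assumptions for illustration, not the plugin's code.

```python
import re

# Approximate pattern for a TWiki variable with parameters (illustrative).
CALL = re.compile(r"%([A-Z][A-Z0-9_]*)\{[^}]*\}%")


def blocking_calls(topic_text, editable_calls):
    """Return the names of calls in the topic that are not in the editable list."""
    allowed = re.compile(r"^(?:%s)$" % editable_calls)
    return [name for name in CALL.findall(topic_text)
            if not allowed.match(name)]
```

With the setting `COMMENT|CALENDAR|INCLUDE`, a topic that contains only `%INCLUDE{...}%` yields an empty list and stays editable, while a `%SEARCH{...}%` call would still block the edit.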
You can define the global variable
to stop the
plugin from ever trying to convert specific HTML tags into
TML when certain specific attributes are present on the tag. This is most
useful when you have styling or alignment information in tags that must be preserved.
This variable is used to tell the translator which attributes, when present
on a tag, make it "stick" i.e. block conversion. For example, setting it to
will stop the translator from trying to
tag that has
attributes, and any
tag that has a
You can use Perl regular expressions to match tag and attribute names, so
will ensure that any tag with an
event handler is kept as HTML.
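The matching behaviour can be sketched as follows. The rule representation used here, a list of (tag pattern, attribute pattern) pairs, is chosen purely for illustration and is not the syntax of the plugin setting; the example pattern for event handlers and the function name are likewise assumptions.

```python
import re


def is_sticky(tag, attributes, rules):
    """Return True if any rule matches the tag and one of its attributes."""
    for tag_pattern, attr_pattern in rules:
        if not re.fullmatch(tag_pattern, tag, re.I):
            continue
        if any(re.fullmatch(attr_pattern, name, re.I) for name in attributes):
            return True
    return False


# Hypothetical rules: keep any tag that has an on* event handler,
# and keep table cells that carry a style attribute.
rules = [
    (r".*", r"on\w+"),
    (r"t[dh]", r"style"),
]
```

A tag for which `is_sticky` returns True would be left as HTML in the saved topic instead of being converted.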
The default setting for this variable is:
If you edit using the plain-text editor, you can use the <sticky>..</sticky> tags to delimit HTML (or TML) that you do not
want to be WYSIWYG edited.
If you are using your own before/after edit handlers, you can call
to check these controls.
Incompatible with "non-standard" syntax
is incompatible with plugins that expand non-standard syntax e.g. TWiki:Plugins.MathModePlugin
Plugins that extend the syntax using TWiki variables, such as
, should work fine.
Because TWiki uses a "best guess" approach to some formatting, it allows overlapping of tags in a way forbidden by HTML, so it is impossible to guarantee 100% that formatting in the original TWiki document will still be there when the same document is loaded and then saved through the WysiwygPlugin. The most obvious case of this is to do with styles. For example, the sentence
*bold _bold-italic* italic_
is legal in TML, but in HTML is represented by
<strong>bold <em>bold-italic</em></strong> <em>italic</em>
which gets translated back to TML as
*bold _bold-italic_* _italic_
which is correct by construction, but does not render correctly in TWiki. This problem is unfortunately unavoidable due to the way TWiki syntax works.
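The back-translation in this example can be reproduced with a toy converter. This is a sketch only: it handles nothing but bold and italic, uses Python's `html.parser` rather than the plugin's translator, and the names are hypothetical.

```python
from html.parser import HTMLParser

# TML emphasis markers for the HTML tags handled by this toy converter.
MARKERS = {"strong": "*", "b": "*", "em": "_", "i": "_"}


class EmphasisToTml(HTMLParser):
    """Replace emphasis tags with TML markers and pass text through."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(MARKERS.get(tag, ""))

    def handle_endtag(self, tag):
        self.out.append(MARKERS.get(tag, ""))

    def handle_data(self, data):
        self.out.append(data)


def html_to_tml(markup):
    parser = EmphasisToTml()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


tml = html_to_tml("<strong>bold <em>bold-italic</em></strong> <em>italic</em>")
```

Since well-formed HTML cannot express the overlap, the converter has only properly nested tags to work from, and the overlapping form of the original TML cannot be recovered.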
This plugin is brought to you by a WikiRing
partner - working together to improve your wiki experience!
Many thanks to the following sponsors for supporting this work:
Related Topics: TWikiPreferences
| Plugin Authors: | TWiki:Main.CrawfordCurrie http://www.c-dot.co.uk |
| Copyright: | © ILOG 2005 http://www.ilog.fr |
| License: | GPL (GNU General Public License) |
| Plugin Version: | 15536 (08 Nov 2007) |
| Change History:
| 8 Nov 2007 | Bugs:Item4923: fixed blocking of table conversion due to empty attributes Bugs:Item4936: An em embedded in an em was getting eaten Bugs:Item4817: added typewriter text button Bugs:Item4850: added font colour controls |
| 2 Nov 2007 | Bugs:Item4903: corrected over-enthusiastic interpretation of ! as an escape |
| 21 Oct 2007 | Bugs:Item4788: fixed unbalanced protect, which could cause loss of protected status Bugs:Item4811: noautolink looks like an HTML construct but in fact is not; the tag is in fact an "on-off" switch and does not imply any HTML structure, so cannot be converted to a DIV or a span, so has to be removed. Bugs:Item4747: added <sticky> to try to overcome limitations in translation Bugs:Item4831: added increased flexibility in deciding what HTML gets converted to TML, and what does not. Analysed all the HTML4 tags to establish initial settings. Bugs:Item4847: don't call non-existent function with older HTML::Parser releases Bugs:Item4844: Saving a table from IE didn't convert it back to TML Bugs:Item4855: table rows generated from TWiki variables were being eaten |
| 6 Oct 2007 | Bugs:Item4700: fixed colspans Bugs:Item4701: removed extra line between %TABLE and the table Bugs:Item4705: fixed spacing around literal and verbatim blocks Bugs:Item4706: merge adjacent verbatim blocks separated only by whitespace Bugs:Item4712: fixed eating of noautolink and literal Bugs:Item4763: list items spanning multiple lines fixed Bugs:Item4867: released tml2html |
| 17 Sep 2007 | Bugs:Item4647: Bugs:Item4652: problems related to DIV fixed. Bugs:Item4653: fixed multi-line twiki variables |
| 16 Sep 2007 | Bugs:Item4630: polished up the way the secret string is done, to ensure synch between perl and JS. Item4622: added UTF-8 handling steps that fixup malformed UTF8 strings before presenting them to the editor (saves Moz) and stops the editor passing them back to TWiki (saves IE). Removed extra entity decoding steps that were causing problems. Bugs:Item4629: fixed issues with verbatim, highlighted by previous mangling of this topic |
| 13 Sep 2007 | Bugs:Item4613 cleaned up spurious message when navigating away Bugs:Item4615 fixed incorrect rendering of emphasis next to br |
| 12 Sep 2007 | Bugs:Item4604 Fixes to REST handler, and add ability to trigger HTML2TML conversion from a content comment alone (required for some editors) Bugs:Item4588 fixes to conversion of double-character emphases |
| 7 Sep 2007 | Bugs:Item4503 excess empty lines Bugs:Item4486 no toc headers with unofficial syntax Bugs:Item4560: empty lines lost Bugs:Item4566: corrupted table on save Bugs:Item4550 section tags being eaten |
| 4 Sep 2007 | Bugs:Item4534 Bugs:Item4535 fixed |
|| Bugs:Item4481 Bugs:Item4524 fixed
|| Bugfixes and refinements done as part of 4.2 release
|| Bugs:Item4435 - further fixes to newlines; this time to remove excess BRs
|| Bugs:Item4453 - noautolink was breaking lists
|| Support for entry of TML constructs such as Set
|| Extensive rewrite to remove a lot of the "clever bits" that try to be too smart with TML and HTML embedded in the topic. By simply passing that through as editable text, we defer to the client side the choice of doing something clever with it, which makes the whole editing experience an awful lot better.
|| Made use of HTML::Entities optional; fixed encoding problem with nbsp on IE; Bugs:Item1742 support for simultaneous edits (post edit merge)
|| Added support for embedded editable HTML in the edit template
|| Split into WysiwygPlugin and KupuContrib
|| Minor doc updates, minor fixes to spacing in lists, integrated Koen Marten's template topic patch
|| Bugs:Item2025: corrected handling of SPAN and FONT tags used for colour changes
|| Bugs:Item1890: doc update
|| Bugs:Item1890: Bugs:Item1041: Bugs:Item944: Much more aggressive cleanup of HTML pasted in from external sources. Excessively verbose HTML (e.g. from Outlook) was causing apparent infinite looping behaviour.
|| Bugs:Item1176: commented out Cairo version of header handler
|| Bugs:Item1625: disable expansion of twiki variables in urls where there are other twiki variables that can't be expanded
|| Bugs:Item1530: support for templatetopic when editing new topics
|| Bugs:Item1532: WysiwygPlugin: Added two more do-not-edit-if-topic-contains parameters, pre+comments
|| Bugs:Item1532: WysiwygPlugin: Kenneths suggestion on proper handling of HTML comments (incl. change to kupu)
|| Bugs:Item1529: evil, evil. The XMLSerializer in IE isn't happy serializing the DOM. I have no idea why. Kupu manages to get away with this because it passes the DOM through the XML validator, which I had to disable because it strips comments. So, for now, the IE implementation will strip comments - but at least you can save again
|| Bugs:Item1501: table handling was a bit spazzy. Several problems fixed.
|| Bugs:Item1518: moved icon and string lists into topics, updated screenshot
|| Bugs:Item1392: reversed the sense of the navigate-away condition, again
|| Bugs:Item1486: added WYSIWYG_EXCLUDE to allow exclusion of 'uneditable' content
|| Bugs:Item1486: was stripping comments, wrongly. Had to disable the kupu filters completely, they just do too much damage.
|| Bugs:Item1457: corrected problem with bullet list at top of topic
|| Bugs:Item1445: small usability improvements
|| Bugs:Item663: TWiki.org doc merge: Fix incorrect link to kupu website
|| Bugs:Item1411: handle case of the result of a TWiki variable being nopped
|| Bugs:Item1317: wrong result returned from generation function when expanding HTML embedded in verbatim block
|| Bugs:Item1397: removed excess space after squab links
|| Bugs:Item1231: added %SPAN% to indicate a spanned-over cell in the editor. Improved handling of HTML in verbatim tags by inserting line breaks if the tag type calls for it, before removing the HTML.
|| Bugs:Item1215: added WYSIWYG_ICONS and WYSIWYG_TAGS to support user customisation of icon images and twiki variables that can be inserted
|| Bugs:Item1314: debugging in case the hang happens again; and made sure to default the editor just in case
|| Bugs:Item1315: short forms must be terminated by one of the same characters that terminate wikiwords
|| Bugs:Item1391: added special interpretation of IMG tags to expand selected TWiki variables within SRC attributes
|| Bugs:Item1340: refined handling of NOP to cover abbrevs
|| Bugs:Item1311: removed excess space inserted in headings
|| Bugs:Item1339: changed from using arbitrary attribute for notoc to a new CSS class. Arbitrary attributes are stripped by Kupu before save.
|| Bugs:Item1344: strip ^Ms inserted by Sarissa during serialisation on IE
|| Bugs:Item1394: still can't get text styles to work properly in IE; but I am now firmly of the opinion that the fault lies with the browser, and not with Kupu.
|| Bugs:Item1341: added appropriate CSS class
|| Bugs:Item1313: added caveat about editing complex HTML and mixed HTML-TML
|| Bugs:Item1334: headers not handled properly in Cairo version
|| Bugs:Item1318: corrected table/list parser for tables embedded in bulleted lists
|| Bugs:Item1310: support for <nop/>
|| Bugs:Item1317: support for limited case of nopped variable
|| Bugs:Item1320: corrected interpretation of relative URL path in []
|| Bugs:Item1259: changed comment handling; rather than trying to create HTML, which gets munged, create an HTML comment. This will only be editable by switching to source view, but hey, it's supposed to be WYSIWYG. Note that this also means that comments in pasted HTML should be retained now
|| Bugs:Item1042: spec of SCRIPTURL changed
|| Bugs:Item1189: reverting accidental checkin of experimental code
|| Bugs:Item1189: filter whitelist is not good enough; need to generate B and I nodes. templates/ pub/TWiki/WysiwygPlugin
|| Bugs:Item1189: it took bloody ages to track down, but finally discovered that bold and italic were being filtered out of spans by Kupu 1.3.2.... too smart for its own good. So added them to the filter whitelist, and it works again.
|| Bugs:Item1189: added pre save filter to try and find where the attributes are disappearing to in FF
|| Bugs:Item1187: for lack of an s on an RE, the nation was lost (well, the multi-line comment actually). Thanks Kenneth!
|| Bugs:Item859: solved issue with non-display of inserted images. Was due to the use of an onSubmit handler to close the dialog, rather than an onLoad handler triggered when the IFRAME that contains the result is loaded.
|| Bugs:Item1172: had to rewrite big chunk of the table popup to get it working with 1.3.2
|| Bugs:Item1151: rewrote link handling stuff to leverage browser better
|| Bugs:Item1175: escape wikiwords within squabs
|| Bugs:Item1158: works for Cairo now as well
|| Bugs:Item1158: first implementation of AJAX interface to allow selection of topics from other webs
|| Bugs:Item1154: removed non-existent scull.gif
|| Bugs:Item1155: added extra recursion block, as Item1155 suggests it is needed
|| Bugs:Item1042: All sorts of clever tricks to handle expansion/compression of a subset of TWiki variables when they are used in URLs. Not a complete solution, but better than it was.
|| Bugs:Item1024: caught out by recursive call to beforeCommonTagsHandler in Cairo (nasty)
|| Bugs:Item1042: whoops, broke \t conversion in Cairo
|| Bugs:Item1140: testcase for 1140
|| Bugs:Item1140: fix rewriting of img src urls (and updated MANIFEST for Kupu1.3.2)
|| Bugs:Item1042: extensive improvements to variable and URL recognition and conversion
|| Bugs:Item856: added doc on EDIT_SKIN to the plugin
|| Bugs:Item1074: upgrade to Kupu 1.3.2 complete (at last)
|| Bugs:Item1074: Fixed source edit mode
|| Bugs:Item1074: tidied up broken toolbar. There are still known issues
|| Bugs:Item1074: first pass at moving to Kupu 1.3.2.
|| Bugs:Item1037: insert wikiword only if selection is zero length
|| Bugs:Item977: changed to remove dangerous Cairo-based assumption, and use context ids instead
|| Bugs:Item1025: added 'escape clause' for old handlers implemented to support old TWiki releases without warnings
|| Bugs:Item941: Eliminated the last of the dynamic globals to try and solve saving problem. Can't test with mod_perl, but is fine with speedycgi AFAICT
|| Bugs:Item873: minor issue; replace br with \n in pre
|| Bugs:Item873: obvious problem parsing closing pre tag on same line as open tag
|| Bugs:Item710: Handling HTML comments
|| Bugs:Item876: Item945: Item876: spacing around table cells, correct handling of variables. Had to compromise on handling [] but I think it's for the best.
|| Bugs:Item871: made sure that brackets are generated for non-wikiwords
|| Bugs:Item928: removed special interpretation of mailto links
|| Bugs:Item866: extended URL parsing to handle MAINWEB and TWIKIWEB twiki variables, in the same hacky way as the core.
|| Bugs:Item870: a couple of corner-cases for correct handling of twiki variables
|| Bugs:Item899: changed list generation to use spaces instead of tabs
|| Bugs:Item180: removed pointless, outdated dependency check from DateFieldPlugin
|| Bugs:Item622: reverted 3 specs to tabs in Set lines in plugins topics for kompatterbility with Kigh-roe
|| Bugs:Item622: tabs -> 3 spaces to avoid confusing the users
|| Bugs:Item638: added instruction to run configure to all install docs (I hope)
|| Bugs:Item569: added default RELEASE to everything that had a version, and removed a load of dead code that was getting in the way
|| Bugs:Item569: computed version numbers for plugins from the repository rev they were built from.
|| Bugs:Item436: incremented vernos of all changed plugins
|| Bugs:Item429: trying to make access controls clearer
|| Bugs:Item340: re-initialisation bug found by ColasNahaboo when using mod_perl; fixed by correctly re-initialising the parse stack for each run of the convertor
|| Bugs:Item340: Release 0.16 of WysiwygPlugin
|| Bugs:Item340: bugfixes for release 0.16 of WysiwygPlugin
|| Bugs:Item335: Switched PNGs to indexed mode, as transparency doesn't work on IE for RGB images
|| Bugs:Item332: Added context identifier to WysiwygPlugin, and a button to the pattern view template. If WysiwygPlugin is enabled, then the button will appear. Neat, huh?
|| Bugs:Item196: getting plugin test suites to pass. Doesn't mean the plugins actually work, just that the test suites run (which is a good indicator)
|| Bugs:Item168: checkpoint checking for 0.16
|| Bugs:Item186: more minor updates
|| Bugs:Item168: new icons, and a couple of bugfixes, to WysiwygPlugin
|| Bugs:Item196: more plugin and contrib fixes for develop; mainly just moving tests around and making sure they all pass.
|| Bugs:Item138: had to change to using beforeCommonTagsHandler and also escape % signs to prevent TWiki from rendering internal tags (as reported by Colas)
|| Bugs:Item168: corrected stupid error on IE; added screenshot
|| Bugs:Item168: release 0.13
|| Bugs:Item168: nearly ready for 0.13
|| Bugs:Item168: corrected images, twikified all images
|| Bugs:Item168: the import from cvs has screwed images
|| Bugs:Item168: twikified icon images, and renamed some images to be more intention-revealing
|| 0.12 beta release
|| Tidied up installer, documentation. Release 0.10
|| pre-release 0.06
|| Version 0.05
|| Checkpoint checking - version 0.03
|| cvsrmtee old files
|| Check in for prototype release
|| Check in for prototype release
|| Most of the toolboxes are working again
|| Initial commit; doesn't do much except run tests
| HTML::Parser | >=3.28 | Required. Available from CPAN. |
| HTML::Entities | >=1.25 | Required. Available from CPAN. |
| Plugin Home: