TOC (this page):

Creating Tags,
Creating Documents,
Marking up Text,
Marking up Tables,
Generating XML,
Search and Replace,
XPath-based Editing

Vim as XML Editor: Tasks

Most of the examples and instructions will only work as shown when carried out in an environment set up as described in the previous chapters. But many of them might work with your settings when you adjust the various details. They are meant as illustrations of the respective general idea and not as perfect recipes.

Creating Tags

Turn on XML mode via [mapleader] x, then enter insert mode via i.

To create a pair of tags, enter the opening tag; you should get the closing one for free. Now you can enter content, for example text.

<foo>bar</foo>

If you want the content to be indented, start again, but this time hit > a second time after closing the opening tag:

<foo>
  bar
</foo>

This way you can quickly create nicely indented XML documents:

<drinks>
  <juice>
    <mango>delicious</mango>
  </juice>
</drinks>

Creating Documents

<chapter><title>Flowers</title>
  <section><title>Lilies</title>
    <para>...</para>
  </section>
  <section><title>Orchids</title>
    <para>...</para>
  </section>
</chapter>

Here is one way to create the above document:

[mapleader] x i < c h a p t e r > > [up] < t i t l e > F l o w e r s [down] < s e c t i o n > > [up] < t i [ctrl-x] [ctrl-p] > L i l i e s [down] < p a r a > . . . [esc] [down] o < s e c [ctrl-x] [ctrl-p] > > [up] < t i [ctrl-x] [ctrl-p] > O r c h i d s [down] < p a r [ctrl-x] [ctrl-l] [esc]

There are many different ways to create the same document. For example it would have been faster to create just one section skeleton and duplicate it.

Now you can fold away the section you aren't working on with 2 G z c:

<chapter><title>Flowers</title>
+---  3 lines: <section><title>Lilies</title>----------
  <section><title>Orchids</title>
    <para>...</para>
  </section>
</chapter>

To learn more about folding see :help folding and :help fold-commands.

Marking up Text

With xmledit

Turn on XML editing mode via [mapleader] x.

To mark up the literal command name string in

<para>
Use cd to go to a different directory.
</para>

select it (eg via 2 G w v e), then type [mapleader] x in quick succession (eg \ x or , x). At the prompt, enter the name of the element, eg command, then press enter twice. You should get something like the following:

<para>
Use <command>cd</command> to go to a different directory.
</para>

With Recording

Although the above method is convenient, it is not the best choice when more than one portion of text has to be marked up, because it can't be repeated or automated easily. Vim, however, lets you record actions; the documentation describes this under "complex repeats", see :help recording.

Let's say you want to mark up the acronyms in

<para>These XSLTs can be used for transforming
DBX to XHTML.</para>

Search for groups of uppercase characters via / \ u \ { 2 , } [enter]. Press n until you reach the first acronym you want to mark up. Then record one of the following macros:

In XML mode (with xmledit): q a i < a c r o n y m > [esc] [right] d % / \ U [enter] P / \ u \ { 2 , } [enter] q
Without xmledit: q a i < a c r o n y m > [esc] / \ U [enter] i < / a c r o n y m > [esc] / \ u \ { 2 , } [enter] q

Now the cursor should be at the beginning of the next group of uppercase characters. If you want to mark it up (if it's an acronym that's not yet marked up) do @ a, otherwise do n to jump to the next group of uppercase letters. To repeat the last executed recorded action do @ @. You should get this:

<para>These <acronym>XSLT</acronym>s can be used for transforming
<acronym>DBX</acronym> to <acronym>XHTML</acronym>.</para>
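If every group of uppercase letters is to be marked up, without the case-by-case decision the macro allows, the same result could be produced in one pass. The following Ruby sketch is only an illustration: the method name mark_up_acronyms is made up for this example, and it assumes that no acronym in the text is marked up yet.

```ruby
# Wrap every run of two or more uppercase ASCII letters in <acronym> tags.
# Assumptions: none of the runs is already marked up, and no tag or
# attribute name in the text consists of uppercase letters.
def mark_up_acronyms(text)
  text.gsub(/[A-Z]{2,}/) { |run| "<acronym>#{run}</acronym>" }
end

para = "<para>These XSLTs can be used for transforming\nDBX to XHTML.</para>"
puts mark_up_acronyms(para)
```

The pattern [A-Z]{2,} plays the role of \u\{2,} in the Vim search; unlike the recorded macro, the script gives you no chance to skip a match.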

Marking up Tables

Recording

Let's say you have some space-separated strings which you want to mark up as an XHTML table:

1  2 3  4   5
2 4    6 8  10
3 6 9  12  15
4 8  12 16 20
5 10    15 20 25

One way is to record some sub-tasks:

get set

[mapleader] x g g > G : % s / $ / [space] . / [enter] g g O [ctrl-d] < t a b l e > > [esc] d d . G p 2 G

cells

q c i < t d > [esc] [right] d % t [space] p [right] d / \ S [enter] q 4 @ c

rows

q r x I < t r > [esc] [right] d % $ p + q

cells and rows

q a 5 @ c @ r q

and now comes the fun part

3 @ a

You should get this:

<table>
  <tr><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td></tr>
  <tr><td>2</td><td>4</td><td>6</td><td>8</td><td>10</td></tr>
  <tr><td>3</td><td>6</td><td>9</td><td>12</td><td>15</td></tr>
  <tr><td>4</td><td>8</td><td>12</td><td>16</td><td>20</td></tr>
  <tr><td>5</td><td>10</td><td>15</td><td>20</td><td>25</td></tr>
</table>
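For comparison, here is a sketch of the same transformation as a small Ruby script. It is not what the macros do step by step; it simply splits each line on whitespace. The method name table_from_columns is hypothetical, and the sketch assumes that no cell contains whitespace or characters that would need escaping.

```ruby
# Turn whitespace-separated values into an XHTML table, one row per line.
# Assumption: no cell contains whitespace or characters that need escaping.
def table_from_columns(text)
  # split without arguments splits on runs of whitespace and drops
  # leading and trailing whitespace; blank lines yield empty arrays
  rows = text.lines.map(&:split).reject(&:empty?)
  body = rows.map do |cells|
    "  <tr>" + cells.map { |cell| "<td>#{cell}</td>" }.join + "</tr>"
  end
  ["<table>", *body, "</table>"].join("\n")
end

data = <<~DATA
  1  2 3  4   5
  2 4    6 8  10
  3 6 9  12  15
  4 8  12 16 20
  5 10    15 20 25
DATA

puts table_from_columns(data)
```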

Substitution

Using substitution commands should be faster than using recorded macros in XML mode. Hundreds of lines should be marked up instantly; thousands of lines still take less than a second on my box. With thousands of lines, undo can take seconds, though, so test and adjust your command line with a small number of representative lines before you run it on larger data sets.

The following examples show how semicolon-separated data which has been exported from a spreadsheet application can be marked up as an XHTML table.

If the data is simple you can use simple commands. Here's some CSV data which has

no empty fields
no quoted fields (no field separators inside fields, no quotes inside quotes)
and no fields that span multiple lines

1;2;3
2;4;6
3;6;9
4;8;12
5;10;15

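Mark up the table cells via either of the following commands; the first uses Vim's own substitution, the second filters the buffer through Ruby: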

:%s/\([^;]*\);\?/<td>\1<\/td>/g

:%!ruby -ne "print gsub(/([^;\n]+);?/,'<td>\1</td>')"

so that you get

<td>1</td><td>2</td><td>3</td>
<td>2</td><td>4</td><td>6</td>
<td>3</td><td>6</td><td>9</td>
<td>4</td><td>8</td><td>12</td>
<td>5</td><td>10</td><td>15</td>

Then mark up the table rows via

:%s/.\+/<tr>&<\/tr>/

Note

If you want to mark up just a range of lines (eg lines 29 through 33) replace :% with :29,33.

Alternatively you could chain the two steps:

:g/./ s/\([^;]*\);\?/<td>\1<\/td>/g | s/.\+/<tr>&<\/tr>/

<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>2</td><td>4</td><td>6</td></tr>
<tr><td>3</td><td>6</td><td>9</td></tr>
<tr><td>4</td><td>8</td><td>12</td></tr>
<tr><td>5</td><td>10</td><td>15</td></tr>
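The two steps can also be sketched as a small standalone Ruby script. This is a minimal illustration under the same restrictions as above (no empty, quoted, or multi-line fields); the method name rows_from_simple_csv is chosen for this example only.

```ruby
# Mark up simple semicolon-separated lines as table rows.
# Assumption: no empty, quoted, or multi-line fields.
def rows_from_simple_csv(text)
  text.each_line.map(&:chomp).reject(&:empty?).map do |line|
    # step 1: one <td> per field
    cells = line.split(";").map { |field| "<td>#{field}</td>" }.join
    # step 2: one <tr> per line
    "<tr>#{cells}</tr>"
  end
end

puts rows_from_simple_csv("1;2;3\n2;4;6\n3;6;9\n4;8;12\n5;10;15\n")
```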

Transforming CSV data to XHTML would typically not be done by manipulating a text editor buffer with commands; one would rather run a little script on a file. But let's try another example just for the fun of it.

Spreadsheet

[spreadsheet screenshot]

Here's some more complex CSV data, as exported from a spreadsheet app:

1;2;3;4
"this field contains a line
break and a record ; separator";4;6;"line
break"
;8;12;
"quoted ""word""";16;;
;;"record ; separator";

Appending a field separator to every line that ends with a complete field (that is, every line that does not break off inside a quoted field) simplifies things a bit:

:v/"[^";]\+$/ s/$/;/

To mark up all fields in the buffer as table cells do

:%s/\("\(""\_.*""\|\_[^"]*\)"\|[^;]*\);/<td>\1<\/td>/g

Mark up table rows:

:%s/\(^\(<td>\_.\{-}<\/td>\)\{-}\)\(\n\|$\)/<tr>\1<\/tr>\3/g

All this is still based on some assumptions; before you process real data, run tests with some lines representing your data and adjust the commands. Cleaning up is not simple either. Here's a cheap way to remove the doubled quotes inside quoted fields:

:%s/""/"/g

... and the quotes around the fields:

:g/./ s/"\(<\/td\)/\1/g | s/\(td>\)"/\1/g

A pretty cryptic way to do both at once (enter as one line):

:%!ruby -e "$nlt='[^<]'; print $stdin.read.gsub(Regexp.compile(
'(<td>)\"('+$nlt+'+)\"(<\/td>)')){($1+$2+$3).gsub(/\"\"/,'\"')}"

This is what you should get:

<tr><td>1</td><td>2</td><td>3</td><td>4</td></tr>
<tr><td>this field contains a line
break and a record ; separator</td><td>4</td><td>6</td><td>line
break</td></tr>
<tr><td></td><td>8</td><td>12</td><td></td></tr>
<tr><td>quoted "word"</td><td>16</td><td></td><td></td></tr>
<tr><td></td><td></td><td>record ; separator</td><td></td></tr>
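As a point of comparison, the "little script" route might look like the following Ruby sketch, which leaves quoting and line breaks to Ruby's csv library instead of regular expressions. The names escape_xml and rows_from_csv are hypothetical, and escaping of &, < and > is an addition that the Vim commands above do not perform.

```ruby
require "csv"

# Escape the characters that are special in XML text content
# (an addition for this sketch; the sample data contains none of them).
def escape_xml(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

# Parse semicolon-separated data and emit one <tr> per record.
# Empty unquoted fields are parsed as nil, hence the to_s.
def rows_from_csv(text)
  CSV.parse(text, col_sep: ";").map do |record|
    cells = record.map { |field| "<td>#{escape_xml(field.to_s)}</td>" }.join
    "<tr>#{cells}</tr>"
  end
end

data = <<~DATA
  1;2;3;4
  "this field contains a line
  break and a record ; separator";4;6;"line
  break"
  ;8;12;
  "quoted ""word""";16;;
  ;;"record ; separator";
DATA

puts rows_from_csv(data)
```

Because the parser already resolves doubled quotes and strips the quotes around fields, no separate clean-up step is needed in this variant.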

Now you could insert dashes into the empty cells

:%s/\(<td>\)\(<\/td>\)/\1-\2/g

mark up the line breaks

:v/<\/tr>$/ s/$/<br \/>/

and then wrap the whole thing in an XHTML table.

Table in Browser

[browser screenshot]

Important

Before you publish web content please check it against the WCAG. For example there's a guideline explaining how to make tables accessible.

Generating XML

From Vim's command line, you can call any tool which is available on your system's path. So when there's a repetitive task like writing lots of similar tags, you could ask your favorite programming language to do it for you.

Start with this

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg version="1.1"
  xmlns="http://www.w3.org/2000/svg"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  viewBox="-10 -60 520 120"
  width="400" height="100">
  <title>Snake Skin</title>
  <desc>A generated pattern.</desc>
  <defs>
    <circle id="c" r="2"/>
  </defs>


</svg>

then place your cursor where the content should go, eg via G 2 k. Enter the following in Vim's command line, all in one line. You can paste it (after having pressed :), eg via the middle mouse button (don't forget to delete the duplicate :).

:r !ruby -e "1.upto(2000){|i|puts '<use xlink:href=\'\#c\' x=\''+
(i/4).to_s+'\' y=\''+((Math.cos(i)*50)).to_s+'\'/>'}"

Now save the SVG and open it in a browser. You should see something like this:

Snake Pattern

A pattern consisting of black dots on white ground.

Here is the SVG itself: cos_pattern.svg.
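The one-liner is hard to read because of the quoting needed on Vim's command line. Here is a sketch of the same generator as a standalone Ruby script; the method name snake_dots and its count parameter are made up for this example, while the arithmetic is taken from the command above.

```ruby
# Build the <use> elements of the pattern: x advances by one every
# fourth step (integer division), y follows a cosine scaled to +/-50.
def snake_dots(count = 2000)
  (1..count).map do |i|
    "<use xlink:href='#c' x='#{i / 4}' y='#{Math.cos(i) * 50}'/>"
  end
end

puts snake_dots
```

Saved under a name of your choice (say snake.rb, a hypothetical file name), the script could be read into the buffer with :r !ruby snake.rb instead of the long command line.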

Search and Replace

Via search and replace, you can make hundreds of similar changes at once.

To go from

<span class="bold">foo</span>

to

<em>foo</em>

select the block(s) via

V}}

then enter the following in Vim's command line:

:'<,'>s/<span\_s\+class="bold">\(\_.\{-}\)<\/span>/<em>\1<\/em>/g
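To see how the parts of this pattern translate to another regular expression dialect, here is a rough Ruby counterpart. It is a sketch, not a drop-in replacement: it works on a whole string rather than on a visual selection, and the method name bold_spans_to_em is made up for this example.

```ruby
# Replace <span class="bold">...</span> with <em>...</em>.
# \s matches line breaks too (like \_s in Vim), the m flag lets . match
# line breaks (like \_.), and .*? is the non-greedy counterpart of \{-}.
def bold_spans_to_em(text)
  text.gsub(/<span\s+class="bold">(.*?)<\/span>/m, '<em>\1</em>')
end

sample = %(<span class="bold">foo</span> and <span\n  class="bold">bar\nbaz</span>)
puts bold_spans_to_em(sample)
```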

To rename all sect1 tags to section tags, you can ask Vim to execute something like

:%s/<\(\/\?\)sect1\(\_s*\)/<\1section\2/gc
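A comparable rename in Ruby might look like the sketch below. Note two deliberate differences, both assumptions of this example rather than part of the command above: there is no interactive confirmation (the c flag has no counterpart here), and a word boundary is added so that element names which merely start with sect1 are left alone. The method name rename_sect1 is hypothetical.

```ruby
# Rename <sect1 ...> and </sect1> to <section ...> and </section>.
# The optional slash is captured and reinserted, so opening and closing
# tags are handled by the same pattern; \b keeps e.g. <sect10> untouched.
def rename_sect1(text)
  text.gsub(%r{<(/?)sect1\b}, '<\1section')
end

puts rename_sect1('<sect1 id="a"><title>T</title></sect1>')
```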

To index some titles in a DBX document I use the following command (one line):

:%s/\(<title>\)\(.\+\)\(<\/title>\)\n\?\(\s*\)/\1\2<indexterm>\r\4
<primary>\2<\/primary>\r\4<\/indexterm>\3\r/gc

This only automates part of the task and is not really general, but unlike XSLT it leaves CDATA sections intact (simply skip them), which makes editing code listings easier. If you get it to work for titles spanning multiple lines, send me an email :).

XPath-based Editing

Sometimes a task is expressed best using XPath. Simpler stuff can be done conveniently from Vim's command line, without having to create an XSLT file.

Rows

Here's an example showing a simple two-step approach to the common task of marking table rows with alternating attributes. Splitting the task this way keeps the command lines shorter and easier to write. Between each pair of screens you find the command line that turns the previous screen into the following one.

<?xml version="1.0"?>
<table>
  <tr>
    <td>foo</td>
  </tr>
  <tr>
    <td>foo</td>
  </tr>
  <tr>
    <td>foo</td>
  </tr>
  <tr>
    <td>foo</td>
  </tr>
</table>

:%!xmlstar ed -i /table/tr -t attr -n class -v odd

Here's what the option names in the above command line stand for:

ed: edit
-i: insert
-t: type
-n: name
-v: value

XMLStarlet also supports long options (eg --insert as alternative for -i), but the short versions save typing and space, which is especially useful when calling XMLStarlet from the shell and not from a script.

<?xml version="1.0"?>
<table>
  <tr class="odd">
    <td>foo</td>
  </tr>
  <tr class="odd">
    <td>foo</td>
  </tr>
  <tr class="odd">
    <td>foo</td>
  </tr>
  <tr class="odd">
    <td>foo</td>
  </tr>
</table>

:%!xmlstar ed -u "/table/tr[(position() mod 2)=0]/@class" -v even

<?xml version="1.0"?>
<table>
  <tr class="odd">
    <td>foo</td>
  </tr>
  <tr class="even">
    <td>foo</td>
  </tr>
  <tr class="odd">
    <td>foo</td>
  </tr>
  <tr class="even">
    <td>foo</td>
  </tr>
</table>

-u stands for "update".
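If XMLStarlet is not at hand, a similar result could be reached with a short Ruby script. The sketch below uses REXML, a Ruby XML library that is not part of the setup described in this document, so treat its availability as an assumption. It selects the rows with the same path (/table/tr) but does the alternation in Ruby rather than with an XPath predicate; the method name stripe_rows is made up for this example.

```ruby
require "rexml/document"

# Set class="odd" / class="even" on the rows of a table in one pass.
def stripe_rows(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "/table/tr").each_with_index do |row, index|
    # index is 0-based, whereas XPath's position() is 1-based
    row.add_attribute("class", index.even? ? "odd" : "even")
  end
  doc
end

doc = stripe_rows("<table><tr><td>foo</td></tr><tr><td>foo</td></tr><tr><td>foo</td></tr></table>")
puts doc
```

The serialized output may differ from XMLStarlet's in details such as attribute quoting and indentation.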