Visitor Design Pattern in Delphi

This session consists of the development of a small application to read and pretty-print XML and CSV files. Along the way, we explain and demonstrate the use of the following patterns: State, Interpreter, Visitor, Strategy, Command, Memento, and Facade.

Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.

What the Visitor pattern does is move the operations on the tree (or other object structure) from the nodes of the tree to another class. This class needs enough information in the interface of each node to perform the operation, so sometimes this can mean more public properties and methods than you might like.

Visitors have been described in Delphi before so we won't dwell overly on them. Essentially, what you need to do is declare a Visitor class, which declares a Visit operation for each node type in the object structure. Here's the base visitor class for the XML interpreter:

TXmlInterpreterVisitor = class(TObject)
private
protected
  procedure Visit(Exp : TXmlStartTag); overload; virtual;
  procedure Visit(Exp : TXmlEndTag); overload; virtual;
  procedure Visit(Exp : TXmlNode); overload; virtual;
  procedure Visit(Exp : TXmlTagList); overload; virtual;
  procedure Visit(Exp : TXmlProlog); overload; virtual;
  procedure Visit(Exp : TXmlDoc); overload; virtual;
public
end;

The Visit methods are virtual so that visitor descendants can override only the ones they need; it may not be necessary to implement them all. I've used method overloading, which works because each Visit takes a different parameter type, but this is not essential: you could use more explicit names instead, e.g. VisitXmlStartTag. The advantage of overloading is that it makes the code a little easier to follow, in my opinion.
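As a rough sketch of the explicit-name alternative, the program below declares a cut-down visitor with named methods instead of overloads. The class names TXmlNamedVisitor and TStartTagReporter are made up for illustration, and the two tag classes are empty stand-ins for the real ones.

```pascal
program ExplicitVisitNames;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Empty stand-ins for two of the real node classes
  TXmlStartTag = class(TObject)
  end;

  TXmlEndTag = class(TObject)
  end;

  // Hypothetical base visitor: distinct names, so no overload directive
  TXmlNamedVisitor = class(TObject)
  public
    procedure VisitXmlStartTag(Exp : TXmlStartTag); virtual;
    procedure VisitXmlEndTag(Exp : TXmlEndTag); virtual;
  end;

  // Hypothetical descendant that only cares about start tags
  TStartTagReporter = class(TXmlNamedVisitor)
  public
    procedure VisitXmlStartTag(Exp : TXmlStartTag); override;
  end;

procedure TXmlNamedVisitor.VisitXmlStartTag(Exp : TXmlStartTag);
begin
  // Default: do nothing, so descendants override only what they need
end;

procedure TXmlNamedVisitor.VisitXmlEndTag(Exp : TXmlEndTag);
begin
  // Default: do nothing
end;

procedure TStartTagReporter.VisitXmlStartTag(Exp : TXmlStartTag);
begin
  WriteLn('visited a start tag');
end;

var
  Visitor  : TXmlNamedVisitor;
  StartTag : TXmlStartTag;
  EndTag   : TXmlEndTag;
begin
  Visitor  := TStartTagReporter.Create;
  StartTag := TXmlStartTag.Create;
  EndTag   := TXmlEndTag.Create;
  try
    Visitor.VisitXmlStartTag(StartTag);  // overridden: prints a line
    Visitor.VisitXmlEndTag(EndTag);      // inherited no-op: prints nothing
  finally
    EndTag.Free;
    StartTag.Free;
    Visitor.Free;
  end;
end.
```

The trade-off is mostly one of taste: named methods make each call site say which node type it handles, while overloads keep every call as a plain Visit(Self).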

In the base expression class we need to define an abstract Accept method. As with search and replace, making the method abstract means that every concrete expression class has to implement it: Delphi warns if you construct a class that still contains an abstract method, and calling one raises EAbstractError at run time. You can see from the TXmlDoc implementation that the code is practically identical to the SearchAndReplace method shown earlier:

procedure TXmlDoc.Accept(Visitor : TXmlInterpreterVisitor);
begin
  // Let the visitor process this node first...
  Visitor.Visit(Self);

  // ...then pass it down to whichever children are present
  if Assigned(Prolog) then begin
    Prolog.Accept(Visitor);
  end;

  if Assigned(TagList) then begin
    TagList.Accept(Visitor);
  end;
end;

The only difference is that we call the Visit method of the Visitor, which is passed as a parameter. Actually, I cheated a bit in one other method too, in order to make pretty printing work better, but if I skip lightly over it you'll never notice.

Once the visitor code is in place in the syntax tree code, adding new visitors does not require any more changes to that code, only the declaration of new visitor classes.
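To illustrate, here is a minimal, self-contained sketch with a one-class "tree". TXmlExpression is an assumed name for the base expression class, the visitor is cut down to a single Visit method, and TTagEcho and TTagCounter are made-up visitors. The point is only that the second visitor is added without touching TXmlStartTag.

```pascal
program TwoVisitors;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TXmlInterpreterVisitor = class;  // forward declaration

  // Assumed name for the base expression class
  TXmlExpression = class(TObject)
  public
    procedure Accept(Visitor : TXmlInterpreterVisitor); virtual; abstract;
  end;

  TXmlStartTag = class(TXmlExpression)
  private
    FTagName : string;
  public
    constructor Create(const ATagName : string);
    procedure Accept(Visitor : TXmlInterpreterVisitor); override;
    property TagName : string read FTagName;
  end;

  // Cut down to a single Visit method for brevity
  TXmlInterpreterVisitor = class(TObject)
  public
    procedure Visit(Exp : TXmlStartTag); overload; virtual;
  end;

  // First operation: echo each tag name
  TTagEcho = class(TXmlInterpreterVisitor)
  public
    procedure Visit(Exp : TXmlStartTag); override;
  end;

  // Second operation, added later: count the tags visited
  TTagCounter = class(TXmlInterpreterVisitor)
  private
    FCount : Integer;
  public
    procedure Visit(Exp : TXmlStartTag); override;
    property Count : Integer read FCount;
  end;

constructor TXmlStartTag.Create(const ATagName : string);
begin
  inherited Create;
  FTagName := ATagName;
end;

procedure TXmlStartTag.Accept(Visitor : TXmlInterpreterVisitor);
begin
  // The node knows its own type, so the right Visit overload is chosen here
  Visitor.Visit(Self);
end;

procedure TXmlInterpreterVisitor.Visit(Exp : TXmlStartTag);
begin
  // Default: do nothing
end;

procedure TTagEcho.Visit(Exp : TXmlStartTag);
begin
  WriteLn(Exp.TagName);
end;

procedure TTagCounter.Visit(Exp : TXmlStartTag);
begin
  Inc(FCount);
end;

var
  Tag     : TXmlExpression;
  Echo    : TTagEcho;
  Counter : TTagCounter;
begin
  Tag     := TXmlStartTag.Create('List');
  Echo    := TTagEcho.Create;
  Counter := TTagCounter.Create;
  try
    Tag.Accept(Echo);        // prints: List
    Tag.Accept(Counter);
    Tag.Accept(Counter);
    WriteLn(Counter.Count);  // prints: 2
  finally
    Counter.Free;
    Echo.Free;
    Tag.Free;
  end;
end.
```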

A concrete visitor class is defined in XmlInterpreterVisitors.pas, showing how to implement a pretty printer. The parser is quite capable of taking a very badly formatted XML file and creating the syntax tree. We will regenerate the document all nicely laid out, like the example XML we saw earlier.

The class definition is:

TXmlPrettyPrinter = class(TXmlInterpreterVisitor)
private
  FList   : TStringList;
  FIndent : Integer;

  function GetText : string;
protected
  procedure AddString(AStr : string);
public
  constructor Create;
  destructor  Destroy; override;
 
  procedure Visit(Exp : TXmlStartTag); override;
  procedure Visit(Exp : TXmlEndTag); override;
  procedure Visit(Exp : TXmlNode); override;
  procedure Visit(Exp : TXmlProlog); override;
  procedure Clear;
 
  property Text : string read GetText;
end;

As you can see, we don't need to override a Visit method for every expression class, as not all of them require anything to be printed. In fact, only those with terminal sub-expressions produce any output: the start and end tags, the data in a node, and the prolog.

We'll keep track of the indent at each point, and on each new line we will prepend the correct number of spaces. I've decided to collect the newly formatted text in a TStringList as it will keep track of new lines for me. The GetText function just accesses the Text property of the list.
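The text-collecting part might look something like the sketch below. It is not the code from XmlInterpreterVisitors.pas: the class name TIndentedText, the Indent and Outdent helpers, and the IndentAmount value of 2 are all assumptions, and the Visit methods are left out so that the program stands on its own.

```pascal
program IndentSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;  // System.Classes when unit scope names are not configured

const
  IndentAmount = 2;  // assumed value; the article does not state it

type
  // Hypothetical stand-in for the pretty printer, minus its Visit methods
  TIndentedText = class(TObject)
  private
    FList   : TStringList;
    FIndent : Integer;
    function GetText : string;
  public
    constructor Create;
    destructor Destroy; override;
    procedure AddString(const AStr : string);
    procedure Indent;
    procedure Outdent;
    property Text : string read GetText;
  end;

constructor TIndentedText.Create;
begin
  inherited Create;
  FList   := TStringList.Create;
  FIndent := 0;
end;

destructor TIndentedText.Destroy;
begin
  FList.Free;
  inherited Destroy;
end;

function TIndentedText.GetText : string;
begin
  // The list joins its lines with line breaks for us
  Result := FList.Text;
end;

procedure TIndentedText.AddString(const AStr : string);
begin
  // Prepend FIndent spaces; each Add starts a new line
  FList.Add(StringOfChar(' ', FIndent) + AStr);
end;

procedure TIndentedText.Indent;
begin
  Inc(FIndent, IndentAmount);
end;

procedure TIndentedText.Outdent;
begin
  Dec(FIndent, IndentAmount);
end;

var
  Printer : TIndentedText;
begin
  Printer := TIndentedText.Create;
  try
    Printer.AddString('<List>');
    Printer.Indent;
    Printer.AddString('<Item>1</Item>');
    Printer.Outdent;
    Printer.AddString('</List>');
    Write(Printer.Text);  // <Item> comes out indented by two spaces
  finally
    Printer.Free;
  end;
end.
```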

Some of the Visit methods are:

procedure TXmlPrettyPrinter.Visit(Exp : TXmlStartTag);
begin
  AddString('<' + Exp.TagName + '>');
  Inc(FIndent,IndentAmount);
end;

procedure TXmlPrettyPrinter.Visit(Exp : TXmlEndTag);
begin
  Dec(FIndent,IndentAmount);
  AddString('</' + Exp.TagName + '>');
  AddString('');
end;

procedure TXmlPrettyPrinter.Visit(Exp : TXmlNode);
begin
  if Exp.Data = '' then begin
    // Print an empty tag
    AddString('<' + Exp.StartTag.TagName + '/>');
  end else begin
    AddString(Format('<%s>%s</%s>',
                     [Exp.StartTag.TagName,
                      Exp.Data,
                      Exp.EndTag.TagName]));
  end;
end;

On finding a start tag, we add the tag to the list, then increment the indent. On an end tag we do the reverse, decrementing first to bring it back into line with the start tag. We also add a blank line after the tag. As it happens, this will only be the case for tags surrounding XML collections, as the place I cheated is on individual data nodes. I arranged the TXmlNode.Accept routine so that if there is no TagList, the start and end tags are not visited, but are left to be dealt with in the node visitor method, as shown above. This is a cheat purely to let me print the tags and data on one line more easily.
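The TXmlNode.Accept routine is not reproduced here, so the following is only a guess at how the arrangement just described could look. The tag and tag-list classes are empty stand-ins, TTraceVisitor is a made-up visitor that reports what it is given, and the visitor is called directly where the real code would presumably go through each child's own Accept method.

```pascal
program NodeAcceptSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Empty stand-ins; the real classes carry tag names, children and Accept
  TXmlStartTag = class(TObject) end;
  TXmlEndTag   = class(TObject) end;
  TXmlTagList  = class(TObject) end;
  TXmlNode     = class;  // forward declaration

  // Hypothetical visitor: each Visit reports what it was given
  TTraceVisitor = class(TObject)
  public
    procedure Visit(Exp : TXmlStartTag); overload;
    procedure Visit(Exp : TXmlEndTag); overload;
    procedure Visit(Exp : TXmlNode); overload;
  end;

  TXmlNode = class(TObject)
  public
    StartTag : TXmlStartTag;
    EndTag   : TXmlEndTag;
    TagList  : TXmlTagList;  // nil for a plain data node
    constructor Create(IsCollection : Boolean);
    destructor Destroy; override;
    procedure Accept(Visitor : TTraceVisitor);
  end;

procedure TTraceVisitor.Visit(Exp : TXmlStartTag);
begin
  WriteLn('  start tag');
end;

procedure TTraceVisitor.Visit(Exp : TXmlEndTag);
begin
  WriteLn('  end tag');
end;

procedure TTraceVisitor.Visit(Exp : TXmlNode);
begin
  WriteLn('  node (prints tags and data on one line)');
end;

constructor TXmlNode.Create(IsCollection : Boolean);
begin
  inherited Create;
  StartTag := TXmlStartTag.Create;
  EndTag   := TXmlEndTag.Create;
  if IsCollection then
    TagList := TXmlTagList.Create;
end;

destructor TXmlNode.Destroy;
begin
  TagList.Free;
  EndTag.Free;
  StartTag.Free;
  inherited Destroy;
end;

procedure TXmlNode.Accept(Visitor : TTraceVisitor);
begin
  if Assigned(TagList) then begin
    // Collection: the tags are visited separately and get their own lines.
    // The real code would also recurse into the tag list between them.
    Visitor.Visit(StartTag);
    Visitor.Visit(EndTag);
  end else begin
    // Data node: skip the tags and let Visit(TXmlNode) print everything
    Visitor.Visit(Self);
  end;
end;

procedure Trace(const Title : string; IsCollection : Boolean);
var
  Visitor : TTraceVisitor;
  Node    : TXmlNode;
begin
  WriteLn(Title);
  Visitor := TTraceVisitor.Create;
  Node    := TXmlNode.Create(IsCollection);
  try
    Node.Accept(Visitor);
  finally
    Node.Free;
    Visitor.Free;
  end;
end;

begin
  Trace('collection node:', True);   // start tag, then end tag
  Trace('data node:', False);        // node only
end.
```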

Adding a new operation on the syntax tree is easy: we just add a new visitor (we could reimplement the search and replace this way, for instance). Related operations are now all in one class and unrelated ones in different classes, rather than each Interpreter class containing many unrelated operations.

Visitor is a really nice pattern, and quite often useful. It is similar to Iterator in that it is used to traverse object structures, but Visitor also works when there is no common parent for the structure items. However, it does have some disadvantages. We mentioned breaking encapsulation earlier, but it can also make life difficult if you often need to add new elements to your structure. For instance, if we added new grammar rules, then we would need to add new methods to the base visitor, and check every concrete visitor to see if it also needed to reflect the changes.

So we can now read both XML and CSV files. It's time to start feeding them documents.
