Playing Dance Dance Revolution with a tail

I didn’t coax her into this.  As soon as I turn my back, she’s there waiting to start.

Slow XPath evaluation for large XML documents in Java 1.5

Feeding a large XML document through JAXP in J2SE 5 (1.5) for XPath processing has some unexpected performance ramifications: the processing time is proportional to the size of the owning Document rather than the size of the sub tree being queried.

Using javax.xml.xpath.XPath.evaluate() (or XPathExpression) has a noticeable overhead cost proportional to the size of the context Node's owning Document.  This is true even if the XPath expression is a simple immediate-child element reference.  That is, evaluating an XPath expression for a given Node seems to “visit” the parent, siblings, cousins, etc. in the Document.

My XML document looks something like this:
<request>
  <operation1>
    <param1>value</param1>
    ...
    <param2>value</param2>
  </operation1>
  <operation2>
  ...
  </operation2>
  ...
</request>

There are about a thousand operation elements, and each has about twenty param elements.  There are a few additional descendants that I’m not listing.

The Java code grabs all the operation Nodes with a /request/child::* expression.  Iterating over each operation, it consults a Map to find which Command object to pass the Node to.  Each Command object extracts the param values by evaluating XPath expressions relative to the operation element.
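A minimal sketch of that dispatch loop, for illustration only.  The Command interface, the class name and the example command are hypothetical stand-ins; the real Command objects aren’t shown here.  Only the /request/child::* expression and the Map lookup come from the description above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DispatchSketch {
    /** Hypothetical stand-in for the Command objects described above. */
    interface Command {
        String execute(Node operation, XPath xpath) throws Exception;
    }

    /** Parses the request, then hands each operation element to its Command. */
    static List<String> dispatch(String xml, Map<String, Command> commands) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // child::* matches element children only, so whitespace text is skipped.
            NodeList operations = (NodeList) xpath.evaluate(
                    "/request/child::*", doc, XPathConstants.NODESET);
            List<String> results = new ArrayList<String>();
            for (int i = 0; i < operations.getLength(); i++) {
                Node operation = operations.item(i);
                Command command = commands.get(operation.getNodeName());
                if (command != null) {  // operations with no Command are ignored
                    results.add(command.execute(operation, xpath));
                }
            }
            return results;
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new RuntimeException(e);
        }
    }

    /** One example Command: reads param1 relative to its operation element. */
    static Map<String, Command> exampleCommands() {
        Map<String, Command> commands = new HashMap<String, Command>();
        commands.put("operation1", new Command() {
            public String execute(Node operation, XPath xpath) throws Exception {
                return xpath.evaluate("param1", operation);
            }
        });
        return commands;
    }
}
```

The relative "param1" evaluation inside the Command is the call whose cost appears to grow with the size of the whole Document.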

Some black-box sleuthing by someone else revealed that sending the operations as a thousand separate XML documents was about ten times faster than sending the operations in one Document.  After some time with OptimizeIt and pondering it over, I have a hypothesis.

As the XPath language is expressive enough to reference parents and siblings relative to the context node, the XPath implementation has to do some preliminary processing of the entire document tree for every evaluation.  Thus each XPath evaluation to obtain the param element from the operation element still needs to examine the entire Document in some fashion.  With M operations of N params each, getting M*N param elements, each time incurring the cost of processing a tree of roughly M*N nodes, would explain the observed behaviour.

I don’t know how I’d go about verifying that hypothesis, short of diving in the source code.
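One black-box way to probe it, short of reading the source, might be a rough timing harness like the sketch below.  It builds synthetic request Documents of different sizes and times the same relative expression against the first operation element.  The class and method names, and the operationN/paramN element names, are made up for this sketch, and a crude loop like this is sensitive to JIT warm-up, so treat any numbers as indicative at best.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class XPathTimingSketch {
    /** Builds <request> with the given number of operations and params each. */
    static Document buildRequest(int operations, int params) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element request = doc.createElement("request");
            doc.appendChild(request);
            for (int i = 0; i < operations; i++) {
                Element op = doc.createElement("operation" + i);
                request.appendChild(op);
                for (int j = 0; j < params; j++) {
                    Element param = doc.createElement("param" + j);
                    param.setTextContent("value");
                    op.appendChild(param);
                }
            }
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Nanoseconds spent evaluating "param0" on the first operation, repeatedly. */
    static long timeFirstOperation(Document doc, int repetitions) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            XPathExpression expr = xpath.compile("param0");
            Node first = doc.getDocumentElement().getFirstChild();
            long start = System.nanoTime();
            for (int r = 0; r < repetitions; r++) {
                expr.evaluate(first);  // same subtree every time
            }
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Comparing, say, timeFirstOperation(buildRequest(10, 20), 100) against timeFirstOperation(buildRequest(1000, 20), 100) queries an identical sub tree both times.  If the hypothesis holds, the second call should be noticeably slower; results will likely differ between JDK versions.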

BTW, caching the XPath expression compilation using XPathExpression doesn’t change anything.

I put in a one-line hack to detach each child node from the Document before passing it along for processing.  I never use any parent or sibling references, nor do I foresee needing them.  It brought the performance of the single large Document back into line with that of many small Documents.
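A sketch of what such a detach could look like.  The actual one-liner isn’t reproduced here, so using removeChild() on the parent is an assumption about how the node gets detached, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DetachSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Detaches each operation from the Document, then reads its param1. */
    static List<String> readParam1Detached(Document doc) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            XPathExpression param1 = xpath.compile("param1");
            NodeList found = (NodeList) xpath.evaluate(
                    "/request/child::*", doc, XPathConstants.NODESET);
            // Copy the nodes first so detaching cannot disturb the result list.
            List<Node> operations = new ArrayList<Node>();
            for (int i = 0; i < found.getLength(); i++) {
                operations.add(found.item(i));
            }
            List<String> values = new ArrayList<String>();
            for (Node operation : operations) {
                // The detach: the operation element becomes the root of its own
                // small tree, so parent and sibling axes no longer reach the rest.
                operation.getParentNode().removeChild(operation);
                values.add(param1.evaluate(operation));
            }
            return values;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Note that this mutates the Document: once the loop finishes, the request element has lost its operation children, which only works because nothing downstream needs the original tree.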

It’s rather unusual behaviour.  I wonder if the JAXP API, in its generality, hides Xalan’s API (J2SE5 uses Xalan internally) for managing and caching these Documents for XPath purposes.  I remember working with Xalan in the Java 1.3/1.4 days, and there was something called a DTM for faster processing of XML via XPath.  IIRC, the DTM created an integer representation of the Document tree and its Nodes.  One worked directly with the DTMManager to cache and reuse this numerical representation.

Scaramouche: a New Year’s Eve dinner

Scaramouche is located in a small cul-de-sac, oddly in a very residential neighbourhood.  Even the building above it is a condominium.  Inside, the decor is classic fine dining.  The view of Toronto is lovely; one envies the view the local residents must have.  New Year’s Eve found me here with a few foodie friends.
Scaramouche - interior
Scaramouche - view of Toronto

Of course, Scaramouche is about the sights at the table rather than outside.  We opened with an Amuse Bouche for each of us:
Amuse Bouche
The fried lotus root slices were crispy, holding a bit of salmon and tomato-based salsa between them.  Looks like a butterfly from here, doesn’t it?

The table had a variety of appetizers.  Grilled Calamari Salad:
Grilled Calamari Salad
Mint, arugula, red onions, celery leaves, fresh red chili, black olives, tossed with salsa verde.

The Serrano Ham:
Serrano Ham
Roasted artichokes, sweet peppers, parsley, Cerignola olives, preserved lemon, Romesco sauce, shaved Toscano cheese.

Six Oysters on the Half Shell:
Six Oysters on the Half Shell

And a few rounds of Butternut Squash soup:
Butternut squash soup

The duck breast special:
duck breast special
The duck was very good, according to a more exacting palate than my own.

The entrees were well decorated.  My compatriots were impatient for me to finish with the pictures so that they could dive in!

The Roasted Squab, Quail and Crispy Duck Confit.  In addition, pan-seared foie gras in a truffled mushroom and winter vegetable pot-au-feu.
Roasted Squab, Quail, and Crispy Duck Confit

Fresh Venison Loin, roasted in smoked bacon with a saute of gnocchi a la parisienne, leeks, porcini mushrooms and sweet corn, red wine jus with English mustard.
Venison loin

Fresh Seafood: lobster, sea scallop, calamari, fresh fish, wilted spinach in a saffron citrus and herb nage.
Fresh seafood dish

Roasted Tournedo of Ontario Veal Tenderloin; braised veal, root vegetable, truffled potato tian, wilted greens, sauteed mushrooms, porcini foam.
Roasted Tournedo of Ontario Veal Tenderloin

We all happily sampled each other’s dishes.  Mostly, I remember the porcini foam of the Veal Tenderloin dish.  That was my primary dish, and I love mushrooms.  I would have licked it off the plate if I thought I could get away with it (with regard to both the restaurant and my dining companions!).

The highlight, of course, is always dessert.  A pretty sight and a good taste.

Coffee Petit Pot de Creme, with biscotti and tuile.

Triple Chocolate Tart, with creme anglaise, chocolate sauce and whipped cream.

Bosc Pears Poached in Raspberry Vanilla Syrup, with pear cake and whipped pastry cream.

Coconut Cream Pie, with white chocolate shavings and dark chocolate sauce.

Sweet Wine and Olive Oil Cake, with roasted plums, honey Mascarpone, red wine syrup and toasted pine nuts.

Apple and Almond Bake, with caramelized ginger sauce and whipped cream.

We did a variation of musical chairs with these desserts, sampling a bit before passing to the left.  It was quite a treat to try so many different desserts!  I’m told the Pot de Creme was unusually good (I’ve never had one before).  I do recall a general dislike for one of either the Apple Bake, Olive Oil Cake, or the Bosc Pears, and the other two being not too memorable.

The Coconut Cream Pie, Scaramouche’s signature dessert, certainly looked the part.  Unfortunately, since I don’t like coconut, I can’t give a fair commentary on it.  I did try to like it!

The Triple Chocolate Tart is an interesting one.  I can say with certainty that it is heavy in dark, bitter chocolate!  I’m allergic to chocolate (I cry, too ^_^), but I insisted on sampling some.  My sample was probably about the size of a small pea, and enough chocolate was in there to immediately trigger my usual reaction (watering eyes and sneezing).

Overall, very nice ambience, pretty dishes, good tastes if you order right, reasonably sized, and expensive.  Not quite what I’d want for the money I spent (give me a bowl of porcini foam!), but the novelty and looks bring it the rest of the way.

Soups (12), Serrano Ham (19), Oysters (19), Calamari Salad (19), Duck Breast (19), Pot au Feu (46), Venison Loin (42), Seafood (46), Veal Tenderloin (43), Bosc Pears (12), Pot de Creme (11), Coconut Pie (12), Olive Oil Cake (12), Chocolate Tart (12), Apple Bake (12).
