We study the problem of querying XML data sources that accept only a
limited set of queries, such as sources accessed through Web services, which can
implement very large (potentially infinite) families of XPath queries. To
specify such families of queries compactly, we adopt Query Set
Specifications~\cite{PetropoulosDP03}, a formalism close to context-free
grammars. We say that a query $Q$ is {\em expressible} by the specification
\calP\ if it is equivalent to some expansion of \calP, and that $Q$ is
{\em supported} by \calP\ if it has an equivalent rewriting using some finite
set of \calP's expansions. We study the complexity of expressibility and support and identify
large classes of XPath queries for which there are efficient (PTIME)
algorithms. Our study considers both the case in which the XML nodes in
query results lose their original identity and the case in which the
source exposes persistent node ids.
Pre-2018 CSE ID: CS2009-0941