Schema aware code completion on XML documents
Authors: Samaresh Panda
Last updated: 07-23-2007
* The tool may still be able to offer code completion for some instance documents that do not explicitly declare conformance to a schema through either schemaLocation or noNamespaceSchemaLocation, for example WSDL files or Ant's build.xml file. To cater to such documents, the module offers a plugin mechanism through which external components can pass in the schema(s) that will be queried.
The Technical Problem
The NetBeans IDE has a dynamic code completion feature: after typing a few characters, the user can pick an item from a list of items that are valid in that context. This is very helpful and lets users concentrate on the task rather than on figuring out what to type next. The feature is available for Java files, HTML files, JSP files and some custom XML files.
Status of code completion in XML documents (NetBeans 5.5)
- An instance document that conforms to a schema doesn't get any code completion support.
- Code completion support is available on a few standard XML files, e.g. web.xml or build.xml.
- Current code completion support on XML files is based on DTD files and is not extensible.
- There is no generic way to extend this feature across all XML files, based on their schemas.
The code completion feature for XML files is very primitive and outdated. Today, every time the schema for an XML file changes, the module owners need to regenerate a DTD for the new schema, register the new DTD and reinstall the module. The process is inefficient and needs manual intervention.
Proposed Solution - Schema aware code completion (NetBeans 6.0)
Typically, XML instance documents conform to some XML Schemas. The proposal is that the IDE reads the schema and offers code completion based on the information model obtained from the schema.
Advantages
- This generic framework can be used for any XML files that conform to a schema.
- All advantages of XML Schemas over DTDs apply.
- The completion result will be live and more accurate.
- The completion result will be namespace aware.
- If a schema changes, the end user updates the XML instance and gets live results based on the new schema.
- Users can get code completion help for XML Schema wildcards.
- Namespace declarations can be inserted. This is similar to the "Fix Imports" feature in NetBeans for Java files.
- Reuse. The core module can be used by others to provide completion for documents that do not explicitly declare conformance to a schema, such as WSDL or Ant's build.xml. Hence fewer bugs and less maintenance.
- New features in schema aware code completion can be seen uniformly across the IDE. For example, when attribute value completion is provided, it will be available to all XML files that use schema aware code completion for free, that is, without any code changes.
Detailed Description
schemaLocation and noNamespaceSchemaLocation
An XML instance document that conforms to a schema declares that schema through the schemaLocation or noNamespaceSchemaLocation attribute on the root element of the document. These attributes provide hints as to the physical location of schema documents which may be used for assessment and validation.
The value of schemaLocation consists of one or more pairs of URI references, separated by white space. The first member of each pair is a namespace name, and the second member is a hint describing where to find an appropriate schema document for that namespace. The value of noNamespaceSchemaLocation, in contrast, is a single URI reference that hints at the location of a schema document with no target namespace. Typically, schemaLocation is used for schemas with a target namespace and noNamespaceSchemaLocation is used for schemas with no target namespace.
Here is an example:
<po:purchaseOrder xmlns:po="http://xml.netbeans.org/schema/PO"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xml.netbeans.org/schema/PO http://xml.netbeans.org/schema/PO.xsd">
    .....
</po:purchaseOrder>
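As a rough illustration of how the attribute value breaks down, the following Python sketch splits a schemaLocation value into namespace/location pairs. The helper name parse_schema_location is made up for this example and is not part of the NetBeans module; the document is a shortened form of the purchase order example.

```python
import xml.etree.ElementTree as ET
from typing import Dict

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def parse_schema_location(value: str) -> Dict[str, str]:
    """Map each namespace name to its location hint (hypothetical helper)."""
    tokens = value.split()  # pairs are separated by any white space
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must contain namespace/location pairs")
    # Even positions are namespace names, odd positions are location hints.
    return dict(zip(tokens[0::2], tokens[1::2]))


DOCUMENT = """<po:purchaseOrder xmlns:po="http://xml.netbeans.org/schema/PO"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xml.netbeans.org/schema/PO http://xml.netbeans.org/schema/PO.xsd"/>"""

root = ET.fromstring(DOCUMENT)
# ElementTree exposes namespaced attributes in {namespace}local form.
hints = parse_schema_location(root.get("{%s}schemaLocation" % XSI, ""))
print(hints)
```

The sketch rejects an odd number of tokens because every namespace name needs a matching location hint.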
Mechanics of Schema aware Code Completion
As a general rule, if the root element provides the necessary hints as to where to find the schemas, the IDE will be able to offer code completion as per the schema(s). The IDE searches for the schemas in the following order:
- First it tries to find the schema in the local file system.
- If not found, it looks up the runtime catalog.
- As a last resort, it fetches the schemas from the internet, caches them locally and uses them to provide code completion.
- If the schema is still not found, no code completion is offered.
XML Schemas define the structure of instance documents. A schema can have one or more top-level elements, and an instance document can be created with any one of them as its root. In the example above, the purchaseOrder element is the document's root element. To get code completion in an instance document, one of the following conditions must be met:
- If the root element is namespace-qualified, the namespace of the document must match the target namespace of one schema specified in the schemaLocation attribute. The schema that matches is said to be the primary schema. The primary schema must have a root element with the same name as the root element of the instance document.
- If the root element is not namespace-qualified, the schema without a target namespace that has a root element with the same name as the root element of the document becomes the primary schema.
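The lookup order for schemas can be sketched roughly as follows. This is only an illustration under stated assumptions: resolve_schema, the dictionary standing in for the runtime catalog, the injected fetch function and the in-memory cache are all hypothetical and do not reflect the actual NetBeans implementation.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def resolve_schema(hint: str, base_dir: Path, catalog: Dict[str, Path],
                   fetch: Callable[[str], str],
                   cache: Dict[str, str]) -> Optional[str]:
    """Return the schema text for a location hint, or None if it is not found."""
    is_remote = "://" in hint
    # 1. Local file system: relative hints are resolved against the
    #    folder of the instance document.
    if not is_remote:
        local = base_dir / hint
        if local.is_file():
            return local.read_text(encoding="utf-8")
    # 2. Runtime catalog: here simply a mapping from hint to a local copy.
    mapped = catalog.get(hint)
    if mapped is not None and mapped.is_file():
        return mapped.read_text(encoding="utf-8")
    # 3. Last resort: fetch from the network and cache the result.
    if is_remote:
        if hint not in cache:
            try:
                cache[hint] = fetch(hint)
            except OSError:
                return None  # 4. Not found anywhere: no completion.
        return cache[hint]
    return None
```

Passing the fetch function in as a parameter keeps the sketch testable without network access; a real implementation would presumably cache to disk rather than in a dictionary.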
Element Code Completion
When the user types a start tag inside an existing element, a list of child elements for that parent element appears. For example, when the cursor is inside the purchaseOrder element and you type "<", you'll see all child elements of purchaseOrder.
Attribute Code Completion
When the user types a space character inside an element tag, a list of attributes for that element appears. For example, when the cursor is inside the shipTo start tag and you type a space, you'll see all attributes of shipTo.
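The two triggers described above ("<" for elements, a space inside a tag for attributes) could be told apart with a check like the one below. This is a simplified sketch, not the IDE's logic: completion_kind is a hypothetical name, and the check ignores cases such as a space typed inside a quoted attribute value.

```python
from typing import Optional


def completion_kind(text: str, offset: int) -> Optional[str]:
    """Classify the completion request at the cursor (hypothetical helper)."""
    before = text[:offset]
    # A freshly typed '<' asks for child elements.
    if before.endswith("<"):
        return "element"
    # The cursor is inside a tag when the last '<' comes after the last '>'.
    last_open = before.rfind("<")
    inside_tag = last_open > before.rfind(">")
    is_end_tag = before[last_open:].startswith("</")
    # A space inside a start tag asks for attributes.
    if inside_tag and not is_end_tag and before[-1:].isspace():
        return "attribute"
    return None
```

For instance, completion_kind("<shipTo ", 8) classifies the request as attribute completion, while a cursor placed in plain text content yields None.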


Algorithm for computing completion items at a given context
At any given context, that is, at any cursor location in an XML document, the following algorithm is applied to get a list of possible completion items. The algorithm is carried out in the following steps.
- Find a schema that can be queried.
  - Find the root element of the document.
  - Look for the schemaLocation or noNamespaceSchemaLocation attribute. If neither is found, return an empty list *.
  - If the root element is namespace-qualified, the namespace of the document must match the target namespace of one schema specified in the schemaLocation attribute. The schema that matches is said to be the primary schema. The primary schema must have a root element with the same name as the root element of the instance document.
  - If the root element is not namespace-qualified, the schema without a target namespace that has a root element with the same name as the root element of the document becomes the primary schema.
- Determine whether element or attribute completion is requested.
  - If a start-tag '<' is entered, it is element completion. If a space is entered inside an element tag, it is attribute completion.
- Get the path from the root element.
  - Traverse the DOM tree from the root to the cursor location and create a path (list) of QNames.
- Query the abstract instance model.
  - Query the model to find the child elements or attributes for the path obtained above, and return them as the list of completion items.
The plugin mechanism described in the note marked * lets existing instance documents that currently get completion from DTDs switch to schemas. In essence, it provides backward compatibility.
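The path and query steps of the algorithm might look roughly like this. It is a sketch under several assumptions: the path is built by scanning tags in the text before the cursor rather than by walking a real DOM, names are kept as written (prefix:local) instead of being resolved to true QNames, and MODEL is a toy dictionary standing in for the abstract instance model. The po:name child is invented for illustration.

```python
import re
from typing import Dict, List, Tuple

# Matches start tags, end tags and empty-element tags. Comments and
# attribute values containing '<' or '>' are not handled in this sketch.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*(?::[\w.\-]+)?)[^<>]*?(/?)>")


def path_to_cursor(text: str, offset: int) -> List[str]:
    """Return the names of the elements still open at the cursor."""
    stack: List[str] = []
    for match in TAG.finditer(text[:offset]):
        closing, name, self_closing = match.groups()
        if closing:
            if stack:
                stack.pop()  # an end tag closes the innermost open element
        elif not self_closing:
            stack.append(name)  # a start tag opens a new element
    return stack


# Toy stand-in for the abstract instance model: path -> child elements.
MODEL: Dict[Tuple[str, ...], List[str]] = {
    ("po:purchaseOrder",): ["po:shipTo"],
    ("po:purchaseOrder", "po:shipTo"): ["po:name"],  # hypothetical child
}


def child_elements(path: List[str]) -> List[str]:
    """Query the toy model for the children allowed at the given path."""
    return MODEL.get(tuple(path), [])
```

Scanning the text instead of parsing it is a deliberate simplification here, because a document being edited is usually not well-formed at the cursor position.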
Wildcard substitution
XML Schema allows wildcards. When using wildcards (xsd:any and xsd:anyAttribute) it is possible to constrain the content with the help of namespaces. Both xsd:any and xsd:anyAttribute come with an optional namespace attribute that may contain any of the values shown in the first column of the table below. This makes it possible to be very specific about where the wildcard replacement content comes from.
The code completion feature substitutes these wildcards as follows:
If namespace is | Substitute with
##any | Any element from any namespace
##other | Any element from a namespace other than the targetNamespace
##targetNamespace | Any element from the targetNamespace
##local | Any unqualified element (no namespace)
List of URIs | Elements from the specified namespaces
Let's take a look at an example:
<a:RootA xmlns:a="http://xml.netbeans.org/schema/A"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xml.netbeans.org/schema/A A.xsd
                        http://xml.netbeans.org/schema/B B.xsd
                        http://xml.netbeans.org/schema/C C.xsd">
    <    <= current cursor position
</a:RootA>
In this example, RootA is one of the root elements defined in schema A.xsd. If RootA had an xsd:any child element, then at the cursor position you would see items from various namespaces, as per the substitution rules above. The same applies to xsd:anyAttribute.
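The substitution rules can be expressed as a small namespace filter. The sketch below is illustrative only: the function names and the element names RootB, RootC and local are hypothetical, None stands for "no namespace", and ##other is treated here as "namespace-qualified and different from the target namespace", which is an assumption about how the table row should be read.

```python
from typing import Dict, List, Optional

A = "http://xml.netbeans.org/schema/A"
B = "http://xml.netbeans.org/schema/B"
C = "http://xml.netbeans.org/schema/C"


def namespace_allowed(constraint: str, target_ns: Optional[str],
                      candidate_ns: Optional[str]) -> bool:
    """Check one candidate namespace against a wildcard's namespace value."""
    if constraint == "##any":
        return True
    if constraint == "##other":
        # Assumption: unqualified elements do not count as "other".
        return candidate_ns is not None and candidate_ns != target_ns
    # Otherwise the value is a white-space separated list of tokens.
    for token in constraint.split():
        if token == "##targetNamespace" and candidate_ns == target_ns:
            return True
        if token == "##local" and candidate_ns is None:
            return True
        if token == candidate_ns:
            return True
    return False


def wildcard_items(constraint: str, target_ns: Optional[str],
                   elements: Dict[Optional[str], List[str]]) -> List[str]:
    """Collect the global elements that may replace the wildcard."""
    items: List[str] = []
    for namespace, names in elements.items():
        if namespace_allowed(constraint, target_ns, namespace):
            items.extend(names)
    return items


# Global elements per namespace; all names except RootA are made up.
ELEMENTS = {A: ["RootA"], B: ["RootB"], C: ["RootC"], None: ["local"]}
print(wildcard_items("##other", A, ELEMENTS))
```

With A.xsd as the primary schema, a wildcard declared with ##other would offer only the elements from the B and C namespaces in this toy data set.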
Insertion of Namespace Declaration
When the user selects an item from another namespace (applicable only in the case of wildcards), the IDE automatically inserts a namespace declaration for that element. For example, if there was an xsd:any element, code completion would substitute the xsd:any with a list of valid elements. Some of these elements may come from namespaces that have not been declared in the document, in which case the tool automatically inserts a namespace declaration.
The tool first tries the prefix "ns1" and uses it if it is not already declared in the document. If it is already in use, the tool tries "ns2", "ns3" and so on. The declaration looks like this:
xmlns:ns1="tns"
where tns is the target namespace of the selected element.
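The prefix search can be sketched in a few lines. The function names and the dictionary of declared prefixes are assumptions made for this example; reusing an existing prefix when the namespace is already declared is also an assumption, since the text only describes the case of an undeclared namespace.

```python
from typing import Dict, Optional, Tuple


def pick_prefix(declared: Dict[str, str]) -> str:
    """Return the first of ns1, ns2, ns3, ... that is not yet declared."""
    n = 1
    while "ns%d" % n in declared:
        n += 1
    return "ns%d" % n


def namespace_declaration(declared: Dict[str, str],
                          namespace: str) -> Tuple[str, Optional[str]]:
    """Return the prefix to use and the declaration to insert, if any."""
    for prefix, uri in declared.items():
        if uri == namespace:
            return prefix, None  # already declared, nothing to insert
    prefix = pick_prefix(declared)
    return prefix, 'xmlns:%s="%s"' % (prefix, namespace)


declared = {"a": "http://xml.netbeans.org/schema/A",
            "ns1": "http://xml.netbeans.org/schema/B"}
print(namespace_declaration(declared, "http://xml.netbeans.org/schema/C"))
```

In this example ns1 is already taken, so the sketch falls through to ns2 for the new namespace.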