Schema aware code completion on XML documents

Authors: Samaresh Panda
Last updated: 07-23-2007

The Technical Problem

The NetBeans IDE has a dynamic code completion feature: after the user types a few characters, the IDE presents a list of items that are valid in that context, and the user picks one. This lets users concentrate on the task at hand rather than on working out what to type next. The feature is available for Java files, HTML files, JSP files and some custom XML files.

Status of code completion in XML documents (NetBeans 5.5)

  • An instance document that conforms to a schema doesn't get any code completion support.
  • Code completion support is available on a few standard XML files, e.g. web.xml or build.xml.
  • Current code completion support on XML files is based on DTD files and is not extensible.
  • There is no generic way to extend this feature across all XML files, based on their schemas.

The code completion feature on XML files is primitive and outdated. Today, every time the schema for an XML file changes, the module owners need to regenerate a DTD from the new schema, register the new DTD and reinstall the module. The process is inefficient and needs manual intervention.

Proposed Solution - Schema aware code completion (NetBeans 6.0)

Typically, XML instance documents conform to some XML Schema. The proposal is that the IDE read the schema and offer code completion based on the information model obtained from the schema.

Advantages

  • This generic framework can be used for any XML files that conform to a schema.
  • All advantages of XML Schemas over DTDs apply.
  • The completion result will be live and more accurate.
  • The completion result will be namespace aware.
  • If a schema changes, the end user only needs to update the XML instance to get live results based on the new schema.
  • User can get code completion help for XML Schema Wildcards.
  • Namespace declarations can be inserted automatically. This is similar to the "Fix Imports" feature in NetBeans for Java files.
  • Reuse. The core module can be used by others to provide completion for documents that do not explicitly declare conformance to a schema, such as WSDL files or Ant's build.xml. Hence fewer bugs and less maintenance.
  • New features in schema aware code completion appear uniformly across the IDE. For example, when attribute value completion is provided, it will be available to all XML files that use schema aware code completion for free, that is, without any code changes.

Detailed Description

schemaLocation and noNamespaceSchemaLocation

An XML instance document that conforms to a schema must reference the schema through the schemaLocation or noNamespaceSchemaLocation attribute on the root element of the document. These attributes provide hints as to the physical location of schema documents that may be used for assessment and validation.

The value of schemaLocation consists of one or more pairs of URI references, separated by white space. The first member of each pair is a namespace name, and the second member of the pair is a hint describing where to find an appropriate schema document for that namespace. The value of noNamespaceSchemaLocation, by contrast, is a single URI reference that hints at the location of a schema document with no target namespace. Typically, schemaLocation is used for schemas with a target namespace and noNamespaceSchemaLocation is used for schemas with no target namespace.

Here is an example:
<po:purchaseOrder
  xmlns:po="http://xml.netbeans.org/schema/PO"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xml.netbeans.org/schema/PO http://xml.netbeans.org/schema/PO.xsd">
      .....
</po:purchaseOrder>
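
As a rough illustration of how these hints could be read, the following Python sketch pulls the namespace/location pairs out of the root element of the purchaseOrder example above. It is not the IDE's implementation; the function name schema_hints and the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(xml_text):
    """Return (pairs, no_ns_hint) read from the root element of xml_text.

    pairs is a list of (namespace name, location hint) tuples taken from
    xsi:schemaLocation; no_ns_hint is the xsi:noNamespaceSchemaLocation
    value, or None when that attribute is absent.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced attribute names as "{namespace}local".
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    no_ns_hint = root.get("{%s}noNamespaceSchemaLocation" % XSI_NS)
    return pairs, no_ns_hint


# The purchaseOrder example from above, without the elided content.
PO_DOC = """<po:purchaseOrder
  xmlns:po="http://xml.netbeans.org/schema/PO"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xml.netbeans.org/schema/PO http://xml.netbeans.org/schema/PO.xsd">
</po:purchaseOrder>"""

pairs, no_ns_hint = schema_hints(PO_DOC)
print(pairs)
```

For this document, pairs holds a single entry that maps the PO namespace to PO.xsd, and no_ns_hint is None.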

Mechanics of Schema aware Code Completion

As a general rule, if the root element provides the necessary hints as to where to find the schemas, the IDE will be able to offer code completion as per the schema(s). The IDE searches for each schema in the following order:

  • First, it tries to find the schema in the local file system.
  • If not found there, it looks up the runtime catalog.
  • As a last resort, it fetches the schema from the internet, caches it locally and uses it to provide code completion.
  • If the schema is still not found, no code completion is offered.
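
The lookup order above can be sketched as a simple fallback chain. This is a minimal Python sketch under stated assumptions, not the IDE's code: resolve_schema, the three lookup callables and the cache dictionary are hypothetical stand-ins, and checking the cache before fetching again is an assumption about how the local cache would be used.

```python
def resolve_schema(hint, local_lookup, catalog_lookup, fetch_remote, cache):
    """Return the schema text for a location hint, or None if no source has it."""
    # Steps 1 and 2: the local file system, then the runtime catalog.
    for lookup in (local_lookup, catalog_lookup):
        schema = lookup(hint)
        if schema is not None:
            return schema
    # Step 3: reuse an earlier download if there is one (assumption), else fetch.
    if hint in cache:
        return cache[hint]
    schema = fetch_remote(hint)
    if schema is not None:
        cache[hint] = schema
    # Step 4: a None result means no code completion can be offered.
    return schema


PO_XSD = "http://xml.netbeans.org/schema/PO.xsd"
remote_files = {PO_XSD: "<xsd:schema/>"}  # stand-in for the internet
fetch_log = []


def fake_fetch(hint):
    """Record each remote request so the caching behaviour is visible."""
    fetch_log.append(hint)
    return remote_files.get(hint)


cache = {}
first = resolve_schema(PO_XSD, {}.get, {}.get, fake_fetch, cache)
second = resolve_schema(PO_XSD, {}.get, {}.get, fake_fetch, cache)
local_hit = resolve_schema("PO.xsd", {"PO.xsd": "<local/>"}.get, {}.get, fake_fetch, cache)
missing = resolve_schema("missing.xsd", {}.get, {}.get, fake_fetch, cache)
```

In this run the remote source is contacted once for PO.xsd and once for missing.xsd; the second request for PO.xsd is answered from the cache, and the local file wins whenever it exists.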

An XML Schema defines the structure of its instance documents. A schema can have one or more top-level elements, and an instance document can be created with any one of them as its root. In the example above, the purchaseOrder element is the document's root element. In order to get code completion in an instance document, one of the following conditions must be met:

  • If the root element is namespace-qualified, the namespace of the root element must match the target namespace of one of the schemas specified in the schemaLocation attribute. The schema that matches is said to be the primary schema. The primary schema must have a root element with the same name as the root element of the instance document.
  • If the root element is not namespace-qualified, the schema without a target namespace that has a root element with the same name as the root element of the document becomes the primary schema.

If one of the above conditions is met, the tool offers code completion when the user types the start of a tag ("<") in the document. Code completion can also be invoked at various levels by pressing the Ctrl and Space keys together. The schema aware code completion feature provides two types of completion: element completion and attribute completion.
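
The two conditions for choosing the primary schema can be sketched as one lookup. The snippet below is only an illustration: each schema is modelled as a plain dictionary, the keys target_namespace and root_elements are invented for this example, and the no-namespace schema with a note element is hypothetical.

```python
def primary_schema(root_namespace, root_name, schemas):
    """Return the schema that can act as primary schema, or None.

    root_namespace is None for a root element that is not
    namespace-qualified; a schema without a target namespace also
    uses None, so one comparison covers both conditions.
    """
    for schema in schemas:
        same_namespace = schema["target_namespace"] == root_namespace
        if same_namespace and root_name in schema["root_elements"]:
            return schema
    return None


PO_NS = "http://xml.netbeans.org/schema/PO"
schemas = [
    {"target_namespace": PO_NS, "root_elements": {"purchaseOrder"}},
    {"target_namespace": None, "root_elements": {"note"}},  # hypothetical
]

qualified = primary_schema(PO_NS, "purchaseOrder", schemas)
unqualified = primary_schema(None, "note", schemas)
no_match = primary_schema(PO_NS, "invoice", schemas)
```

Here qualified is the PO schema, unqualified is the schema without a target namespace, and no_match is None because the PO schema declares no invoice root element.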

Element Code Completion

When the user types the start of a tag inside an existing element, a list of child elements for that parent element appears. For example, when the cursor is inside the purchaseOrder element and you type "<", you'll see all child elements of purchaseOrder.

Figure: Element code completion

Attribute Code Completion

When the user types a space character inside an element's start tag, a list of attributes for that element appears. For example, when the cursor is inside the start tag of the shipTo element and you type a space, you'll see all attributes of shipTo.

Figure: Attribute code completion


Algorithm for computing completion items at a given context

At any given context, that is, at any cursor location in an XML document, the following algorithm is applied to get a list of possible completion items. The algorithm is carried out in four steps.
  • Find a schema that can be queried.
      • Find the root element of the document.
      • Look for the attributes schemaLocation or noNamespaceSchemaLocation. If neither is found, return an empty list *.
      • If the root element is namespace-qualified, the namespace of the root element must match the target namespace of one of the schemas specified in the schemaLocation attribute. The schema that matches is said to be the primary schema. The primary schema must have a root element with the same name as the root element of the instance document.
      • If the root element is not namespace-qualified, the schema without a target namespace that has a root element with the same name as the root element of the document becomes the primary schema.
  • Determine whether this is element or attribute completion.
      • If "<" is entered, it is an element completion. If a space is entered inside an element's start tag, it is an attribute completion.
  • Get the path from the root element.
      • Traverse the DOM tree from the root of the tree to the cursor location and create a path (list) of QNames.
  • Query the abstract instance model.
      • Query the model for the child elements or attributes at the path obtained above, and return them as the list of completion items.
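
The last three steps can be illustrated with a small Python sketch. It makes several assumptions that are not in the text above: the cursor is modelled as the element that contains it, the abstract instance model is reduced to a dictionary keyed by the QName path, child elements are assumed to be namespace-qualified, and the child element and attribute names listed for purchaseOrder and shipTo are invented for illustration.

```python
import xml.etree.ElementTree as ET


def qname_path(root, target):
    """Return the list of QNames from root down to target (inclusive)."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    path = []
    node = target
    while node is not None:
        path.append(node.tag)  # ElementTree spells a QName as "{namespace}local"
        node = parent_of.get(node)
    path.reverse()
    return path


def completion_items(model, path, typed_char):
    """Return element items for "<", attribute items for a space, else nothing."""
    if typed_char == "<":
        kind = "elements"
    elif typed_char == " ":
        kind = "attributes"
    else:
        return []
    return model.get(tuple(path), {}).get(kind, [])


PO = "http://xml.netbeans.org/schema/PO"
doc = ET.fromstring(
    '<po:purchaseOrder xmlns:po="%s"><po:shipTo/></po:purchaseOrder>' % PO
)
ship_to = doc[0]  # the element that contains the cursor

# Toy stand-in for the abstract instance model (all names are illustrative).
model = {
    ("{%s}purchaseOrder" % PO,): {
        "elements": ["shipTo", "billTo", "items"],
        "attributes": ["orderDate"],
    },
    ("{%s}purchaseOrder" % PO, "{%s}shipTo" % PO): {
        "elements": ["name", "street"],
        "attributes": ["country"],
    },
}

path = qname_path(doc, ship_to)
attribute_items = completion_items(model, path, " ")
element_items = completion_items(model, qname_path(doc, doc), "<")
```

Keying the model by the full path rather than by element name alone keeps the lookup correct when the same local name is declared with different content in different parents.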

* The tool may still be able to offer code completion for some instance documents that do not explicitly declare conformance to a schema through either schemaLocation or noNamespaceSchemaLocation, for example WSDL files or Ant's build.xml file. To cater to such documents, the module offers a plugin mechanism through which external components can pass in the schema(s) to be queried.

This feature will allow all existing instance documents that currently get completion based on DTDs to switch to schemas. In essence, it allows backward compatibility.

Wildcard substitution

XML Schema allows wildcards. When using wildcards (xsd:any and xsd:anyAttribute), it is possible to constrain the content with the help of namespaces. Both xsd:any and xsd:anyAttribute come with an optional namespace attribute that may contain any of the values shown in the first column of the table below. This makes it possible to be very specific about where the wildcard replacement content comes from.

The code completion feature substitutes these wildcards as follows:
If namespace is      Substitute with
##any                Any element from any namespace
##other              Any element from a namespace other than the targetNamespace
##targetNamespace    Any element from the targetNamespace
##local              Any unqualified element (no namespace)
List of URIs         Elements from the specified namespaces
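
The substitution rules in the table can be written as a small predicate. The Python sketch below is an illustration only: the function name wildcard_allows is made up, an unqualified element is represented by None, and excluding unqualified elements from ##other follows the XML Schema 1.0 rule rather than anything stated above. The namespaces A, B and C are the ones used in the example that follows in this document.

```python
def wildcard_allows(namespace_attr, target_namespace, element_namespace):
    """Return True if a wildcard with this namespace attribute admits the element.

    element_namespace is None for an unqualified element.
    """
    if namespace_attr == "##any":
        return True
    if namespace_attr == "##other":
        # Assumption (XML Schema 1.0): unqualified elements are excluded too.
        return element_namespace is not None and element_namespace != target_namespace
    # Otherwise the value is a white space separated list whose members are
    # namespace URIs, ##targetNamespace or ##local.
    for token in namespace_attr.split():
        if token == "##targetNamespace":
            if element_namespace == target_namespace:
                return True
        elif token == "##local":
            if element_namespace is None:
                return True
        elif token == element_namespace:
            return True
    return False


A = "http://xml.netbeans.org/schema/A"
B = "http://xml.netbeans.org/schema/B"
C = "http://xml.netbeans.org/schema/C"
known_namespaces = [A, B, C, None]


def allowed(namespace_attr):
    """List the known namespaces a wildcard in schema A would draw items from."""
    return [ns for ns in known_namespaces if wildcard_allows(namespace_attr, A, ns)]


other_result = allowed("##other")
listed_result = allowed(B + " ##local")
```

With A as the target namespace, other_result contains only B and C, while listed_result contains B and the unqualified case.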


Let's take a look at an example:
<a:RootA xmlns:a="http://xml.netbeans.org/schema/A"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xml.netbeans.org/schema/A A.xsd
  http://xml.netbeans.org/schema/B B.xsd
  http://xml.netbeans.org/schema/C C.xsd">

  < <= current cursor position

</a:RootA>

In this example, RootA is one of the root elements defined in schema A.xsd. If RootA had an xsd:any child element, then at the cursor position you would see items from various namespaces, as per the substitution rules above. The same applies to xsd:anyAttribute.

Insertion of Namespace Declaration

When the user selects an item from another namespace (applicable only in the case of wildcards), the IDE automatically inserts a namespace declaration for that element. For example, if there was an xsd:any element, code completion would substitute the xsd:any with a list of valid elements. Some of these elements may come from namespaces that have not been declared in the document, in which case the tool automatically inserts a namespace declaration.

The tool first tries the prefix "ns1" and uses it if that prefix is not already declared in the document. If it is already declared, the tool tries "ns2", "ns3" and so on. The declaration looks like this:

xmlns:ns1="tns"

Here, tns is the target namespace of the selected element.
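
The prefix search described above is simple enough to sketch directly. In this Python illustration the function names pick_prefix and namespace_declaration are hypothetical, and the set of already declared prefixes is assumed to be available from the document.

```python
from itertools import count


def pick_prefix(declared_prefixes):
    """Return the first of ns1, ns2, ns3, ... that is not yet declared."""
    for n in count(1):
        prefix = "ns%d" % n
        if prefix not in declared_prefixes:
            return prefix


def namespace_declaration(declared_prefixes, namespace):
    """Build the xmlns attribute to insert for the given namespace."""
    return 'xmlns:%s="%s"' % (pick_prefix(declared_prefixes), namespace)


# Prefixes already declared in the RootA example are "a" and "xsi".
declaration = namespace_declaration({"a", "xsi"}, "http://xml.netbeans.org/schema/B")
next_free = pick_prefix({"ns1", "ns2", "xsi"})
```

For the RootA example, declaration uses ns1 because neither "a" nor "xsi" clashes with it; next_free shows the search skipping prefixes that are already taken.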

Conclusion

The idea behind schema aware code completion is to offer code completion in instance documents more efficiently and accurately. At the same time, we also want to fix the DTD-based code completion for some legacy documents.


