Last updated: 07-23-2007
The Technical Problem
The NetBeans IDE has a dynamic code completion feature that lets the
user type a few characters and then pick an item from a list of items
that are valid in that context. This is very helpful and lets the user
concentrate on the task at hand rather than on figuring out what to
type next. The feature is available for Java files, HTML files, JSP
files and some custom XML files.
Status of code completion in XML documents (NetBeans 5.5)
- An instance document that conforms to a schema does not get any
code completion support.
- Code completion support is available for a few standard XML files,
e.g. web.xml or build.xml.
- The current code completion support for XML files is based on DTD
files and is not extensible.
- There is no generic way to extend this feature across all XML
files based on their schemas.
The code completion feature for XML files is very primitive and
outdated. Today, every time the schema for an XML file changes, the
module owners need to regenerate a DTD from the new schema, register
the new DTD and reinstall the module. The process is very inefficient
and needs manual intervention.
Proposed Solution - Schema aware code completion (NetBeans 6.0)
Typically, XML instance documents conform to some XML Schemas. The new
proposal suggests that the IDE read the schema and offer code
completion based on the information model obtained from the schema.
Advantages
- This generic framework can be used for any XML file that conforms
to a schema.
- All advantages of XML Schemas over DTDs apply.
- The completion result will be live and more accurate.
- The completion result will be namespace aware.
- If a schema changes, the end user only needs to update the XML
instance to get live results based on the new schema.
- The user can get code completion help for XML Schema wildcards.
- Namespace declarations can be inserted. This is similar to the "Fix
Import" feature in NetBeans for Java files.
- Reuse. The core module can be used by others to provide
completion for documents that do not explicitly declare conformance to
a schema, such as WSDL files or Ant's build.xml. Hence fewer bugs and
less maintenance.
- New features in schema aware code completion can be seen
uniformly across the IDE. For example, when we provide attribute value
completion, it will be available to all XML files (that use schema
aware code completion) for free, that is, without any code changes.
Detailed Description
schemaLocation and noNamespaceSchemaLocation
An XML instance document that conforms to a schema must reference the
schema through the schemaLocation or noNamespaceSchemaLocation
attribute on the root element of the document. These attributes
provide hints as to the physical location of schema documents that may
be used for assessment and validation.
The value of schemaLocation consists of one or more pairs of URI
references, separated by white space. The first member of each pair is
a namespace name, and the second member of the pair is a hint
describing where to find an appropriate schema document for that
namespace. The value of noNamespaceSchemaLocation, in contrast, is a
single URI reference that hints at the location of a schema document
with no target namespace. Typically, schemaLocation is used for
schemas with a target namespace and noNamespaceSchemaLocation is used
for schemas with no target namespace.
Here is an example:
<po:purchaseOrder
    xmlns:po="http://xml.netbeans.org/schema/PO"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xml.netbeans.org/schema/PO
                        http://xml.netbeans.org/schema/PO.xsd">
    .....
</po:purchaseOrder>
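As a rough illustration of how a schemaLocation value breaks down into
namespace/hint pairs, here is a minimal Python sketch. The function name
parse_schema_location is hypothetical and is not part of the NetBeans
code base; it only mirrors the pairing rule described above.

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, location hint) pairs."""
    tokens = value.split()  # any run of white space separates the URI references
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold an even number of URI references")
    # Tokens alternate: a namespace name, then the hint for that namespace.
    return list(zip(tokens[0::2], tokens[1::2]))


# The attribute value from the purchaseOrder example.
pairs = parse_schema_location(
    """http://xml.netbeans.org/schema/PO
       http://xml.netbeans.org/schema/PO.xsd"""
)
```

Each pair can then be used to look up the schema for one namespace.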
Mechanics of Schema aware Code Completion
As a general rule, if the root element provides the necessary hints as
to where to find the schemas, the IDE will be able to offer code
completion (CC) based on those schemas. The IDE searches for the
schemas in the following order:
- First, it tries to find the schema in the local file system.
- If not found, it looks up the runtime catalog.
- As a last resort, it fetches the schemas from the internet, caches
them locally and uses them to provide CC.
- If the schema is still not found, no CC is offered.
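The lookup order above can be sketched as a chain of sources that are
tried in turn. This is only an illustration under stated assumptions:
resolve_schema and cached are hypothetical names, and the three sources
are stood in for by plain dictionaries rather than real file, catalog
or network access.

```python
def resolve_schema(location, sources):
    """Return (source name, schema text) from the first source that has the schema.

    `sources` is an ordered list of (name, lookup) pairs; each lookup takes the
    location hint and returns the schema text, or None when it has no match.
    """
    for name, lookup in sources:
        schema = lookup(location)
        if schema is not None:
            return name, schema
    return None  # nothing found anywhere: no code completion is offered


def cached(fetch, cache):
    """Wrap a fetch function so that successful results are kept in `cache`."""
    def lookup(location):
        if location not in cache:
            result = fetch(location)
            if result is None:
                return None
            cache[location] = result
        return cache[location]
    return lookup


# Stand-in data (hypothetical, for illustration only).
local_files = {"PO.xsd": "schema text from local file"}
catalog = {"http://xml.netbeans.org/schema/A.xsd": "schema text from catalog"}
remote = {"http://example.org/B.xsd": "schema text from network"}
download_cache = {}

sources = [
    ("local file system", local_files.get),
    ("runtime catalog", catalog.get),
    ("internet", cached(remote.get, download_cache)),
]
```

Keeping the sources in an ordered list makes the priority explicit and
lets the network fetch stay the last resort.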
XML Schemas define the structure of instance documents. A schema can
have one or more top-level elements, and an instance document can be
created with any one of those top-level elements as its root. In the
example above, we have used the purchaseOrder element as the
document's root element. In order to get code completion in an
instance document, one of the following conditions must be met:
- If the root element is namespace-qualified, the namespace of the
root element must match the target namespace of one of the schemas
specified in the schemaLocation attribute. The schema that matches is
said to be the primary schema. The primary schema must have a root
element with the same name as the root element of the instance
document.
- If the root element is not namespace-qualified, the schema
without a target namespace that has a root element with the same name
as the root element of the document becomes the primary schema.
If one of the above conditions is met, the tool offers code completion
when the user types the start-tag character "<" in the document. Code
completion can also be invoked at various levels of the document by
pressing the Ctrl and Space keys together. The schema aware code
completion feature provides two types of completion: element
completion and attribute completion.
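The two conditions for choosing the primary schema can be sketched as a
single comparison, if "no namespace" is represented as None. The
function name find_primary_schema and the dictionary keys
target_namespace and root_elements are assumptions made for this
example; they do not describe the actual NetBeans data model.

```python
def find_primary_schema(root_namespace, root_name, schemas):
    """Pick the primary schema for a document root, or None if there is none.

    `root_namespace` is None when the root element is not namespace-qualified.
    Each schema is a dict with a "target_namespace" (None when absent) and a
    set of "root_elements" (names of its top-level elements).
    """
    for schema in schemas:
        same_namespace = schema["target_namespace"] == root_namespace
        if same_namespace and root_name in schema["root_elements"]:
            return schema
    return None


# Hypothetical schema summaries for illustration.
po_schema = {
    "target_namespace": "http://xml.netbeans.org/schema/PO",
    "root_elements": {"purchaseOrder", "comment"},
}
plain_schema = {"target_namespace": None, "root_elements": {"note"}}
schemas = [po_schema, plain_schema]

primary = find_primary_schema(
    "http://xml.netbeans.org/schema/PO", "purchaseOrder", schemas
)
```

Because None compares equal to None, the unqualified case needs no
separate branch.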
Element Code Completion
When you type a start tag inside an existing element, you'll see a
list of child elements for that parent element. For example, when
you're inside the purchaseOrder element and you start typing the start
tag "<", you'll see all child elements of purchaseOrder.
Attribute Code Completion
When you type a 'Space' character inside an element tag, you'll see a
list of attributes for that element. For example, when you type a
space inside the shipTo start tag, you'll see all attributes of
shipTo.
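One way to tell the two completion types apart is to look at the text
between the last "<" and the cursor. The sketch below is a simplified
illustration: completion_kind is a hypothetical name, and the code
assumes that attribute values contain no "<" or ">" characters and do
not mix quote styles.

```python
def completion_kind(text_before_cursor):
    """Return "element", "attribute" or None for the given cursor context."""
    last_open = text_before_cursor.rfind("<")
    last_close = text_before_cursor.rfind(">")
    if last_open <= last_close:
        return None  # the cursor is in text content, not inside a tag
    tag_text = text_before_cursor[last_open + 1:]
    if tag_text.startswith(("/", "!", "?")):
        return None  # end tag, comment/declaration or processing instruction
    if not any(ch.isspace() for ch in tag_text):
        return "element"  # just "<", or "<" plus part of an element name
    quotes_balanced = tag_text.count('"') % 2 == 0 and tag_text.count("'") % 2 == 0
    if tag_text[-1].isspace() and quotes_balanced:
        return "attribute"  # white space typed after the name or after a value
    return None
```

For instance, completion_kind("<po:purchaseOrder><") yields "element",
while completion_kind("<po:purchaseOrder><shipTo ") yields "attribute".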
Algorithm for computing completion items at a given context
In any given context, that is, at any cursor location in an XML
document, the following algorithm is applied to get a list of possible
completion items. The algorithm is carried out in four steps.
- Find a schema that can be queried.
  - Find the root element of the document.
  - Look for the attributes schemaLocation or
    noNamespaceSchemaLocation. If neither is found, return an empty
    list *.
  - If the root element is namespace-qualified, the namespace of the
    root element must match the target namespace of one of the schemas
    specified in the schemaLocation attribute. The schema that matches
    is said to be the primary schema. The primary schema must have a
    root element with the same name as the root element of the
    instance document.
  - If the root element is not namespace-qualified, the schema without
    a target namespace that has a root element with the same name as
    the root element of the document becomes the primary schema.
- Check whether this is element or attribute completion.
  - If a start-tag character '<' is entered, it is an element
    completion. If a space is entered inside an element tag, it is an
    attribute completion.
- Get the path from the root element.
  - Traverse the DOM tree from the root of the tree to the cursor
    location and create a path (list) of QNames.
- Query the abstract instance model.
  - Query the model to find the child elements or attributes for the
    path obtained above, and return them as a list of completion
    items.
* The tool may still be able to offer code completion for some
instance documents that do not explicitly declare conformance to a
schema through either schemaLocation or noNamespaceSchemaLocation, for
example WSDL files or Ant's build.xml file. In order to cater to such
documents, the module offers a plugin mechanism through which external
components can pass in the schema(s) that will be queried.
This feature will allow all existing instance documents that currently
provide completion based on DTDs to switch to schemas. In essence, it
allows backward compatibility.
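The "path from the root element" step of the algorithm above can be
approximated without a full DOM by scanning the text before the cursor
and keeping a stack of open elements. The sketch below is an
illustration under stated assumptions: path_to_cursor is a hypothetical
name, the regular expressions handle only simple well-formed markup,
and comments or CDATA sections that contain tags are not handled.

```python
import re

# A complete start, end or empty-element tag: "<", optional "/", a QName,
# optional attributes, optional "/" and ">".
TAG = re.compile(r"<(/?)([\w.-]+(?::[\w.-]+)?)((?:\s+[^<>]*?)?)(/?)>")
# One attribute with a double- or single-quoted value.
ATTR = re.compile(r"""([\w.:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def path_to_cursor(text_before_cursor):
    """Return the elements open at the cursor as (namespace, local name) pairs."""
    stack = []  # one (qname, in-scope prefix map) entry per open element
    for closing, name, attrs, self_closing in TAG.findall(text_before_cursor):
        if closing:
            if stack:
                stack.pop()  # an end tag closes the innermost open element
            continue
        # Namespace declarations are inherited from the parent element.
        scope = dict(stack[-1][1]) if stack else {}
        for attr_name, double_quoted, single_quoted in ATTR.findall(attrs):
            if attr_name == "xmlns":
                scope[""] = double_quoted or single_quoted
            elif attr_name.startswith("xmlns:"):
                scope[attr_name[len("xmlns:"):]] = double_quoted or single_quoted
        prefix, _, local = name.rpartition(":")
        qname = (scope.get(prefix), local)  # namespace is None when undeclared
        if not self_closing:
            stack.append((qname, scope))
    return [qname for qname, _ in stack]


document_prefix = (
    '<po:purchaseOrder xmlns:po="http://xml.netbeans.org/schema/PO">'
    '<shipTo country="US"><name>Alice</name><'
)
path = path_to_cursor(document_prefix)
```

The resulting list names the elements that enclose the cursor, from the
root inward. For attribute completion, the name of the unfinished tag
at the cursor would still have to be appended to the path.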
Wildcard substitution
XML Schema allows wildcards. When using wildcards (xsd:any and
xsd:anyAttribute), it is possible to constrain the content with the
help of namespaces. Both xsd:any and xsd:anyAttribute come with an
optional namespace attribute that may contain any of the values shown
in the first column of the table below. This makes it possible to be
very specific about where the wildcard replacement content comes from.
The code completion feature substitutes these wildcards as follows:
| If namespace is | Substitute with |
| ##any | Any element from any namespace |
| ##other | Any element from a namespace other than the targetNamespace |
| ##targetNamespace | Any element from the targetNamespace |
| ##local | Any unqualified element (no namespace) |
| List of URIs | Elements from the specified namespaces |
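The substitution rules in the table can be read as a predicate that
decides whether an element from a given namespace may replace the
wildcard. The following Python sketch is an illustration only:
wildcard_allows is a hypothetical name, None stands for "no namespace",
and treating ##other as excluding unqualified elements is an assumption
based on the XML Schema 1.0 reading of that value.

```python
def wildcard_allows(namespace_attr, candidate_ns, target_ns):
    """Tell whether an element from `candidate_ns` may replace the wildcard.

    `namespace_attr` is the value of the wildcard's namespace attribute;
    `candidate_ns` and `target_ns` are namespace URIs, or None for "no namespace".
    """
    if namespace_attr == "##any":
        return True
    if namespace_attr == "##other":
        # Assumption: unqualified elements do not count as "other".
        return candidate_ns is not None and candidate_ns != target_ns
    allowed = set()
    for token in namespace_attr.split():  # a white-space separated list
        if token == "##targetNamespace":
            allowed.add(target_ns)
        elif token == "##local":
            allowed.add(None)
        else:
            allowed.add(token)  # an explicit namespace URI
    return candidate_ns in allowed


NS_A = "http://xml.netbeans.org/schema/A"
NS_B = "http://xml.netbeans.org/schema/B"
NS_C = "http://xml.netbeans.org/schema/C"
```

With NS_A as the target namespace, wildcard_allows("##other", NS_B,
NS_A) is true, while wildcard_allows("##targetNamespace", NS_B, NS_A)
is false.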
Let's take a look at an example:
<a:RootA
    xmlns:a="http://xml.netbeans.org/schema/A"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://xml.netbeans.org/schema/A A.xsd
                        http://xml.netbeans.org/schema/B B.xsd
                        http://xml.netbeans.org/schema/C C.xsd">
    <     <= current cursor position
</a:RootA>
In this example, RootA is one of the root elements defined in schema
A.xsd. If RootA had an xsd:any child element, then at the cursor
position you would see items from various namespaces, as per the
substitution rules above. The same applies to xsd:anyAttribute.
Insertion of Namespace Declaration
When the user selects an item from another namespace (applicable only
in the case of wildcards), the IDE will automatically insert a
namespace declaration for that element. For example, if there was an
xsd:any element, code completion would substitute the xsd:any with a
list of valid elements. Some of these elements may come from
namespaces that have not been declared in the document, in which case
the tool will automatically insert a namespace declaration.
The tool first tries the prefix "ns1" and, if that prefix is not
already declared in the document, uses it. If it is already declared,
the tool tries "ns2", "ns3" and so on. The declaration looks like
this:
xmlns:ns1="tns"
where tns is the target namespace of the selected element.
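The prefix search described above amounts to a small loop. In this
sketch, pick_prefix and namespace_declaration are hypothetical names,
and the set of prefixes already declared in the document is assumed to
be available.

```python
def pick_prefix(declared_prefixes):
    """Return the first of "ns1", "ns2", ... that is not declared yet."""
    n = 1
    while f"ns{n}" in declared_prefixes:
        n += 1
    return f"ns{n}"


def namespace_declaration(declared_prefixes, namespace):
    """Build the xmlns declaration to insert for `namespace`."""
    return f'xmlns:{pick_prefix(declared_prefixes)}="{namespace}"'


# Prefixes already used by the RootA example document.
declaration = namespace_declaration(
    {"a", "xsi"}, "http://xml.netbeans.org/schema/B"
)
```

Here declaration holds xmlns:ns1="http://xml.netbeans.org/schema/B",
because "ns1" is not yet declared in the document.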
Conclusion
The idea behind schema aware code completion is to offer code
completion in instance documents more efficiently and accurately. At
the same time, we also wanted to fix the DTD based code completion for
some legacy documents.