XML Dependency Resolution “Deprez” Feature


Authors:

Todd Fast

Chris Webster

Girish


Goals


A NetBeans project can be considered a shared, intermittently connected environment. Having to supply XML resources (schema and WSDL) that are not available is a major burden on project collaborators. This specification seeks to preserve the project model for WSDL and schema artifact dependencies. The goal is to ensure that resources are available locally and can be shared, yet can also be refreshed easily, as needed, in a predictable fashion that is consistent with project development.

  • Provide a standard way to retrieve XML resources from a variety of sources. Retrieval will compute the transitive closure of the referenced resources, which may require user intervention. The retrieval process will understand the semantics of the retrieved resources and will generate appropriate binding elements if necessary (that is, enough information to resolve dependencies).

  • Provide the ability to participate, at development time, in standard JAXP and schema validation processes.

  • Provide an interface to allow the referenced resources to be queried. The queries will make it possible to determine all mapped resources, all mapped resources that are broken, and the closure of all resolvers reachable via references. A user interface will also be provided to allow the dependency resolver entries to be changed.

  • Promote local references to better integrate with the project system.

  • Support the above capabilities without internal access to the containing project. This allows the resolution capability to support any project type.


Non Goals


  • Introspection of the referenced artifacts themselves. The resolver does not attempt to provide any indexing capability similar to what is currently done in the Java metamodel via MDR, or to the indexing capability provided by the JES registry. The dependency resolver can be used to retrieve domain models (such as schema and WSDL), which have strong query capabilities. Thus the path to indexing will be through the domain models themselves.

  • Provide a standardized view of references. The infrastructure will allow references to be resolved (on a per artifact basis), but the appropriate view at the project level (if any) is left to the project type.

  • Provide a single source of all the references in the project. Dependency resolution is specified on a per artifact basis, which eliminates the possibility of naming collisions (references from different artifacts using the same URI become problematic in a global repository).

  • Cross-project references to artifacts will not be supported initially, due to the complexity of the design. This may be revisited at a later point. For now, artifacts that belong to another project will have to be imported into the current project to work.


Background

XML instance documents and well-known definition documents such as XML schema and WSDL require the ability to perform late binding, that is, to resolve a reference to another document. There are several variations of resolution depending on the usage context. For example, XML Schema provides a global attribute for instance documents, schemaLocation, which specifies the location of the schema for a given namespace. The schemaLocation is used during validation to retrieve the appropriate schema. XML schema and WSDL both offer similar features when using other namespaces. WSDL provides the ability to import other WSDL documents (Basic Profile requires that only WSDL documents be imported using this mechanism), and XML schema also provides several ways to bring in other documents (include, import, redefine).
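As a small illustration of the schemaLocation mechanism described above: the attribute value is a whitespace-separated list of namespace/location pairs. The sketch below reads those pairs from the root element of an instance document. The class name, method name, and sample document are invented for this example and are not part of the proposal.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class SchemaLocationHints {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** Returns the namespace -> location pairs declared on the root element. */
    static Map<String, String> read(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getAttributeNS returns "" when the attribute is absent.
            String value = doc.getDocumentElement()
                    .getAttributeNS(XSI_NS, "schemaLocation").trim();
            Map<String, String> hints = new LinkedHashMap<>();
            if (value.isEmpty()) {
                return hints;
            }
            String[] tokens = value.split("\\s+");
            // Tokens alternate: namespace, location, namespace, location, ...
            for (int i = 0; i + 1 < tokens.length; i += 2) {
                hints.put(tokens[i], tokens[i + 1]);
            }
            return hints;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read schemaLocation", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<po:order xmlns:po='urn:example:po'"
                + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
                + " xsi:schemaLocation='urn:example:po schemas/po.xsd'/>";
        System.out.println(read(xml));
    }
}
```

Each location in the returned map is exactly the kind of URI (relative or absolute) that the resolver would need to map to a local file.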


The semantics of using referenced artifacts differ depending on usage, but what remains constant is the need to define an additional indirection to the physical file. The location reference is specified as a URI and thus includes both relative locations and URL references. Intermittently connected development environments present a challenge to URL referencing; a typical solution is to cache the resource referenced by the URL. Caching also improves performance by reducing network access. Consider something like the OTA schema, which includes over 100 files: fetching a schema like this during each validation would at least introduce a potential performance issue, and would not function as expected when a developer is not connected to the network. A relative reference may also present a problem in the development environment, as the deployed location may differ from the VCS layout. For example, a development environment may have multiple projects which together comprise a deployment. The runtime location may thus differ from the development time location and affect the relative location reference.


Proposal


Infrastructure (Locating and Querying resources)

The deprez infrastructure will provide a factory for locating resources on a per artifact basis. The interface will provide a way to retrieve a resource either as a stream or as a ModelSource. A model source is used as input to the model factories. In addition to providing access to a model, the infrastructure will provide the ability to determine all the referenced URIs and the broken URIs, to add and remove URIs, to listen for property change events, and to obtain the closure of all other dependency resolvers which are reachable from the initial resolver. The interface will also implement the interfaces necessary to interact with JAXP as well as LSResourceResolver for interaction with schema validation. The method for adding entries to deprez will integrate seamlessly with the project system.
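The query operations listed above could be sketched roughly as follows. This is only an in-memory illustration under stated assumptions: the class ArtifactResolver and all of its method names are hypothetical, a reference is treated as broken when it has no mapped target, and stream/ModelSource access and property change events are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical per-artifact resolver; none of these names come from the specification.
class ArtifactResolver {
    private final String artifact;
    // Referenced URI -> resolver of the local artifact it maps to (null when unresolved).
    private final Map<String, ArtifactResolver> mappings = new LinkedHashMap<>();

    ArtifactResolver(String artifact) {
        this.artifact = artifact;
    }

    String getArtifact() {
        return artifact;
    }

    void addReference(String uri, ArtifactResolver target) {
        mappings.put(uri, target);
    }

    void removeReference(String uri) {
        mappings.remove(uri);
    }

    /** All URIs referenced by this artifact, mapped or not. */
    Set<String> getReferencedUris() {
        return new LinkedHashSet<>(mappings.keySet());
    }

    /** Referenced URIs that currently have no local mapping. */
    Set<String> getBrokenUris() {
        Set<String> broken = new LinkedHashSet<>();
        mappings.forEach((uri, target) -> {
            if (target == null) {
                broken.add(uri);
            }
        });
        return broken;
    }

    /** All resolvers reachable from this one, excluding itself; cycles are tolerated. */
    Set<ArtifactResolver> getClosure() {
        Set<ArtifactResolver> seen = new LinkedHashSet<>();
        Deque<ArtifactResolver> pending = new ArrayDeque<>();
        pending.add(this);
        while (!pending.isEmpty()) {
            for (ArtifactResolver next : pending.remove().mappings.values()) {
                if (next != null && next != this && seen.add(next)) {
                    pending.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        ArtifactResolver wsdl = new ArtifactResolver("orders.wsdl");
        ArtifactResolver types = new ArtifactResolver("types.xsd");
        wsdl.addReference("http://example.com/types.xsd", types);
        wsdl.addReference("http://example.com/missing.xsd", null);
        System.out.println("broken: " + wsdl.getBrokenUris());
        System.out.println("closure size: " + wsdl.getClosure().size());
    }
}
```

Keeping the mapping per artifact, rather than in one project-wide table, mirrors the non-goal stated earlier: two artifacts may reference the same URI without colliding.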


Resource Retrieval

Create a wizard which would allow documents to be retrieved from local disks, URLs, and ebXML / UDDI repositories. The wizard would retrieve the files from the specified location and copy them to the specified project location. The wizard would need extension points to allow the closure to be determined, as the closure will be slightly different for WSDL and schema. The wizard would retrieve the original file, then load the model and determine the resources which need to be retrieved. If a resource reference is relative, the resource will be retrieved using the base location of the original file. The wizard may also require user intervention after the initial invocation (if there are issues when resolving the transitive closure of referenced resources).
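The retrieval loop described above (fetch the original file, find its references, resolve relative ones against the base location, repeat) might look roughly like this for schema documents. It is a sketch under assumptions: the class and method names are hypothetical, an in-memory map stands in for the network or repository, and only top-level include, import, and redefine elements are inspected. A WSDL extension would need its own reference extraction.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the wizard's actual implementation.
class SchemaClosure {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    /** Locations named by top-level include, import and redefine elements. */
    static List<String> references(String schemaText) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(schemaText)));
            List<String> found = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (!(n instanceof Element) || !XSD_NS.equals(n.getNamespaceURI())) {
                    continue;
                }
                String name = n.getLocalName();
                if (name.equals("include") || name.equals("import") || name.equals("redefine")) {
                    String location = ((Element) n).getAttribute("schemaLocation");
                    if (!location.isEmpty()) {
                        found.add(location);
                    }
                }
            }
            return found;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema", e);
        }
    }

    /**
     * Breadth-first retrieval starting at 'start'. Returns the fetched documents keyed by
     * absolute URI; URIs that 'source' cannot supply are added to 'missing' so that a user
     * could be prompted for them afterwards.
     */
    static Map<URI, String> retrieve(URI start, Map<URI, String> source, Set<URI> missing) {
        Map<URI, String> fetched = new LinkedHashMap<>();
        Deque<URI> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            URI current = pending.remove();
            if (fetched.containsKey(current) || missing.contains(current)) {
                continue; // already handled; also guards against circular references
            }
            String text = source.get(current);
            if (text == null) {
                missing.add(current);
                continue;
            }
            fetched.put(current, text);
            for (String location : references(text)) {
                // Relative locations are resolved against the referencing document.
                pending.add(current.resolve(location));
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        String xs = "<xs:schema xmlns:xs='" + XSD_NS + "'>";
        Map<URI, String> remote = Map.of(
                URI.create("http://example.com/schemas/order.xsd"),
                xs + "<xs:include schemaLocation='types/common.xsd'/></xs:schema>",
                URI.create("http://example.com/schemas/types/common.xsd"),
                xs + "</xs:schema>");
        Set<URI> missing = new LinkedHashSet<>();
        Map<URI, String> fetched =
                retrieve(URI.create("http://example.com/schemas/order.xsd"), remote, missing);
        System.out.println("fetched: " + fetched.keySet());
        System.out.println("missing: " + missing);
    }
}
```

Collecting unresolved URIs instead of failing on the first one matches the idea that user intervention may be needed after the initial invocation.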


Illustration 1 below shows the entry point into the resource retrieval process. This wizard will collect the information necessary to retrieve the resource, introspect the resource, and transitively retrieve any additional resources necessary. During the retrieval process, additional user prompting may also be necessary. Because the prompting may happen recursively and its extent will not be known ahead of time, the ability to interact after the initial collection may be required. This could be similar to the refactoring preview, which is a special window docked into the output area. The window would resemble the Mozilla download manager, except that the nodes in the tree could be edited to supply additional information. The UI illustration is only representative of what could be done and is not the final design (nor is the interaction described above).

The wizard could be launched from the New File wizard, but it would be more useful to invoke it in areas where new references are created. For example, when creating a new element, a type can be referenced; if the type is not yet available from the schema, a new resource may need to be retrieved. Invoked in this manner, the deprez information could be generated automatically (whereas the New File wizard would require a separate step).

If a resource is located in another project, the appropriate project references would be created. There are currently at least two different ways to specify project references (although Ant project types are more common) and no generic APIs, so this may require support from the underlying project, perhaps by exposing something in the project lookup. Finally, this wizard can store the original references of the resources so that the resources can be refreshed easily.



Illustration 1: Resource Collection Panel




Resource Mapping

In addition to retrieving resources, the ability to interact with deprez through the interface is necessary, either to resolve a broken reference (this is one entry point for the wizard above) or to edit the set of resources. References can become broken due to changes in the set of projects (for inter-project references), the contents of a project, or changes to the resource itself (changes to import and similar elements). A sample screenshot is shown below. It shows the referenced URIs and the current mapping (if any), and offers the ability to display only broken references as well as to purge deprez of unused entries. The ability to launch the resource wizard would be useful here as well. The expected behavior would be to resolve the broken entries either by pointing to existing files or by retrieving new files.



Illustration 2: Resource Mapping




Files retrieved via an absolute URI, along with their closures, are considered read-only resources and will be stored and versioned in a common directory (<project-root>/external-refs/{schema}/{wsdl}). A public catalog file (at <project-root>/external-refs/public.xml) will have entries for all the artifacts pulled from the web. This catalog will be chained to all the peer catalog files, so that a lookup in a peer catalog automatically results in a lookup in the public catalog. The user will not be given the option to store these files in any other directory (to enforce that they remain read-only). A user who wants to edit these files has to copy them to some other folder in the project and edit the copies. The closure references of such a copied file can be handled through the peer catalog file.
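The chaining described above corresponds to the nextCatalog element of OASIS XML Catalogs. The sketch below builds a throwaway project layout and resolves a web URI through a peer catalog that delegates to public.xml. Several things here are assumptions for illustration: it uses the javax.xml.catalog API built into JDK 9 and later (Files.writeString needs JDK 11) rather than any resolver the proposal itself would use, the peer catalog is named catalog.xml and placed at the project root, retrieved schemas are placed in a schema subdirectory, and the example.com URI is invented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;

// Hypothetical demo of a peer catalog chained to the public catalog.
class ChainedCatalogDemo {
    static final String CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog";

    /** Creates a temporary project layout and resolves systemId via the peer catalog. */
    static String resolveThroughPeer(String systemId) {
        try {
            Path root = Files.createTempDirectory("project-root");
            Path schemaDir = Files.createDirectories(root.resolve("external-refs").resolve("schema"));

            // The locally stored copy of the schema pulled from the web.
            Files.writeString(schemaDir.resolve("po.xsd"),
                    "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'/>");

            // Public catalog: maps the original web URI to the local copy.
            Files.writeString(root.resolve("external-refs").resolve("public.xml"),
                    "<catalog xmlns='" + CATALOG_NS + "'>"
                    + "<system systemId='http://example.com/schemas/po.xsd' uri='schema/po.xsd'/>"
                    + "</catalog>");

            // Peer catalog: has no entry of its own, but chains to the public catalog.
            Path peer = root.resolve("catalog.xml");
            Files.writeString(peer,
                    "<catalog xmlns='" + CATALOG_NS + "'>"
                    + "<nextCatalog catalog='external-refs/public.xml'/>"
                    + "</catalog>");

            CatalogResolver resolver =
                    CatalogManager.catalogResolver(CatalogFeatures.defaults(), peer.toUri());
            // No match in the peer catalog, so the lookup continues in public.xml.
            return resolver.resolveEntity(null, systemId).getSystemId();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveThroughPeer("http://example.com/schemas/po.xsd"));
    }
}
```

Because entries in public.xml use URIs relative to the catalog file itself, the external-refs directory can be versioned and shared without embedding machine-specific paths.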

Local resources (from the hard disk or from another project) can be retrieved transitively using the same wizard. In this case, however, the user will be allowed to choose the directory in which to store the files. Also, these files will not be marked as read-only.


Transitive Closure View and Fixing Broken References


Create a wizard/view for each Schema/WSDL that shows transitive closure information to the user. This interface will look as depicted in Illustration 3.



















Illustration 3: Self-explanatory wizard that not only shows the transitive closure of a schema (or WSDL) but also lets the user resolve any broken references.








Implementation Details



Catalog in JAXP and schema Validation

The JAXP API recognizes that a URI may require additional mapping and provides the ability to specify either an EntityResolver in the case of DOM or the resolveEntity method for SAX. This makes it possible to resolve references to other resources programmatically, using an arbitrary algorithm; however, this power also requires additional code to be written and does not provide a generic capability to resolve references. The OASIS XML Catalogs specification defines an XML format that maps the URI specified in the referencing document to the URI of the actual resource (typically a file on the user's file system). The XML Commons Catalog Resolver provides a resolver which can be used for standard JAXP resolution as well as for JDK 1.5 schema validation. The resolution interface is slightly different for schema validation, as additional information is provided (both system and public information); this resolver implements both org.xml.sax.EntityResolver and org.w3c.dom.ls.LSResourceResolver. Using this resolver provides an easy way to externalize entity resolution. As a side note, the XmlValidate Ant task provides a way to specify an inline catalog that references multiple external catalogs via the catalogpath element.
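To make the two JAXP hooks mentioned above concrete, the following sketch installs a resolver on a DOM DocumentBuilder and overrides resolveEntity on a SAX handler. The resolver is a hand-written stand-in, not the XML Commons Catalog Resolver: it maps system identifiers to in-memory text so the example runs offline, where a catalog-based resolver would map them to local files. All class names, the sample document, and the example.com URI are assumptions; LSResourceResolver integration is not shown.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a catalog: maps system identifiers to local content.
class LocalEntityResolver implements EntityResolver {
    private final Map<String, String> local;

    LocalEntityResolver(Map<String, String> local) {
        this.local = local;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String text = local.get(systemId);
        if (text == null) {
            return null; // unmapped: the parser falls back to its default lookup
        }
        InputSource source = new InputSource(new StringReader(text));
        source.setSystemId(systemId);
        return source;
    }
}

class ResolverDemo {
    static final String DTD_URI = "http://example.com/dtd/note.dtd";
    static final String DOC =
            "<!DOCTYPE note SYSTEM '" + DTD_URI + "'><note>&greeting;</note>";
    static final Map<String, String> LOCAL = Map.of(DTD_URI,
            "<!ELEMENT note (#PCDATA)><!ENTITY greeting 'resolved locally'>");

    /** DOM: install the resolver on the DocumentBuilder. */
    static String parseWithDom() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(new LocalEntityResolver(LOCAL));
            return builder.parse(new InputSource(new StringReader(DOC)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    /** SAX: override resolveEntity on the handler passed to the parser. */
    static String parseWithSax() {
        StringBuilder text = new StringBuilder();
        LocalEntityResolver resolver = new LocalEntityResolver(LOCAL);
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                return resolver.resolveEntity(publicId, systemId);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(DOC)), handler);
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("DOM: " + parseWithDom());
        System.out.println("SAX: " + parseWithSax());
    }
}
```

In both cases the parser never contacts example.com: the external DTD is supplied by the resolver, which is the behavior a disconnected developer needs.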


The catalog specification (http://www.oasis-open.org/committees/download.php/14041/xml-catalogs.html#s.ext.ent) also provides a processing instruction to specify a catalog file which may be useful when interpreting the document (for example, <?oasis-xml-catalog catalog="http://example.com/catalog.xml"?> specifies a catalog which could be used for this document). At this time, I am not sure whether this processing instruction is used, but it may become more interesting in the future.
