Understanding JAXB: Java Binding Customization

http://www.onjava.com/pub/a/onjava/2003/12/10/jaxb.html?page=1 by Sayed Hashimi
12/10/2003

Java Architecture for XML Binding (JAXB) is a specification (or standard) that automates the mapping between XML documents and Java objects, in both directions. One of the primary components of JAXB is the schema compiler, the tool used to generate Java bindings from an XML schema document. In its default mode, the compiler usually generates bindings that are awkward to work with in non-trivial applications. This article looks at various methods you can use to customize the generated bindings.

What is JAXB and Why Do We Need It?

More often than not, when a Java application consumes an XML document, it performs two tasks that are not specific to the application: 1) it converts a request XML document to a set of Java objects, and 2) it creates a response XML document after it has consumed the request XML document.

An implementation of JAXB will autogenerate Java bindings for you. This means that you can avoid:

  1. Writing a bunch of classes that map to an XML document (binding).
  2. Writing code to take a request XML document and hydrate a set of Java objects (unmarshalling).
  3. Creating a response XML document from a set of Java objects (marshalling).
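
To make the three chores above concrete, here is a minimal sketch of what they look like when written by hand with the JDK's DOM and transformer APIs. The BillTo element and its Name and City children are assumptions made for illustration only; they are not taken from the article's schema.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class HandWrittenBinding {
    // 1. Binding: a hand-written class mirroring a hypothetical <BillTo> element.
    static final class BillTo {
        final String name;
        final String city;

        BillTo(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    // 2. Unmarshalling: parse the XML and hydrate the Java object.
    static BillTo unmarshal(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return new BillTo(text(root, "Name"), text(root, "City"));
        } catch (Exception e) {
            throw new IllegalStateException("could not unmarshal BillTo", e);
        }
    }

    // 3. Marshalling: build an XML document from the Java object.
    static String marshal(BillTo billTo) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("BillTo");
            doc.appendChild(root);
            append(doc, root, "Name", billTo.name);
            append(doc, root, "City", billTo.city);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not marshal BillTo", e);
        }
    }

    // Returns the text of the first descendant element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Appends a child element carrying the given text value.
    private static void append(Document doc, Element parent, String tag, String value) {
        Element child = doc.createElement(tag);
        child.setTextContent(value);
        parent.appendChild(child);
    }
}
```

Every new element or attribute in the schema means another field and more code in both directions, which is the repetitive work a binding compiler is meant to take over.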
 

At a high level, JAXB is composed of three components: the binding compiler, the binding runtime framework, and the binding language. The binding compiler is essentially a .jar file that generates Java bindings from an XML schema document. The binding runtime framework is a Java API that handles transforming XML documents to Java objects and Java objects back to XML. The binding language is an XML-based language that allows for customizing the Java bindings. This article is about customizing the Java bindings generated by the binding compiler.

Why Do We Need to Consider Customization?

The binding compiler in the reference implementation of JAXB is the .jar file jaxb-xjc.jar. The responsibility of this .jar file is to take an XML schema document and to generate a Java representation. In generating the Java representation, the compiler uses a set of binding rules that dictate how the binding should be done. For example, as a rule, the binding compiler uses an algorithm to generate Java identifier names (e.g., an interface name) from XML names when a Java identifier name is not specified via the binding language. At times, the autogenerated names are not useful when manipulating Java objects. In non-trivial applications that make use of JAXB, one will need to customize the resulting Java representation, because the default binding rules alone will not suffice. Therefore, it is important to understand how we can control the Java bindings generated by the binding compiler. The JAXB specification mentions two mechanisms that we can use to customize the bindings. I, however, think that there are three. The specification lists:

  • Annotating the XML schema document (inline binding declaration).
  • Supplying the binding compiler an external binding document (external binding declaration).

A very powerful technique not mentioned is:

  • Modifying the XML schema so that you get the binding that you want.

Combining these methods allows us to truly take control of what the binding compiler generates. The sections that follow describe the methods mentioned above with examples. We will start by looking at how we can influence the Java representation of an XML file by modifying the XML schema.

Modify the XML Schema

The binding compiler takes an XSD and generates a Java representation. When we use the Java bindings in our code, we would like the Java objects to have names that make sense. For example, if my XML document has the following structure:

[XML sample lost in extraction: a Message element containing Header and Request elements]

It would make sense for me to have a Message object (or interface), a Header object, and a Request object in my Java representation. A lot of times, however (depending on the XML schema), the Java representation (e.g., the names given to Java objects) is not semantically meaningful; using the Java representation is awkward. For example, instead of having an object of the type Request, you may have something called RequestType. When you say Message.getRequest(), you get back a RequestType and not a Request. Although we can get by with using the default Java bindings, we want to get a representation that makes sense. We want the binding compiler to generate a set of objects as if we were hand-coding the bindings ourselves. To that end, let's understand how we can modify the XML schema to get a better Java representation.

It is important to realize that more than one XSD can validate the same XML. For example, I can take an XSD and change how I define the XML components (e.g., an element), without changing the overall structure, and have the XSD validate the same XML. Similarly, I can take two XSDs that validate the same XML and have the binding compiler generate two very different Java representations for the very same XML. Simply changing how we define our XML components while writing the schema can be a lifesaver when it comes to getting the correct Java representation. Let's see how.
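
The claim that two differently written XSDs can accept the same document can be checked with the JDK's javax.xml.validation API. The sketch below is only an illustration. Both schemas and the sample document are hypothetical, much smaller stand-ins and are not the article's listings. The first uses an anonymous complex type with a ref; the second uses a named complex type whose name matches the element name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TwoSchemasOneDocument {
    static final String XS = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>";

    // Hypothetical schema A: anonymous complex types wired together with ref.
    static final String WITH_REFS = XS
        + "<xs:element name='PurchaseOrder'><xs:complexType><xs:sequence>"
        + "<xs:element ref='BillTo'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "<xs:element name='BillTo'><xs:complexType><xs:sequence>"
        + "<xs:element name='Name' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Hypothetical schema B: a named complex type; element name and type name match.
    static final String WITH_NAMED_TYPES = XS
        + "<xs:element name='PurchaseOrder'><xs:complexType><xs:sequence>"
        + "<xs:element name='BillTo' type='BillTo'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "<xs:complexType name='BillTo'><xs:sequence>"
        + "<xs:element name='Name' type='xs:string'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:schema>";

    // Hypothetical instance document used against both schemas.
    static final String DOCUMENT =
        "<PurchaseOrder><BillTo><Name>Sayed Hashimi</Name></BillTo></PurchaseOrder>";

    // Returns true when the XML is valid against the given XSD, false otherwise.
    static boolean validates(String xsd, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // schema could not be compiled or document is invalid
        }
    }
}
```

Calling validates(WITH_REFS, DOCUMENT) and validates(WITH_NAMED_TYPES, DOCUMENT) should both succeed. To a validator the two schemas are interchangeable, even though a binding compiler may treat them quite differently.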

We will start by taking an XSD and generating the Java representation of it. Then we will look at the generated classes and see how we interact with the Java bindings in our Java code, and then see if it makes programmatic sense. The XSD is shown in Listing 1, a sample XML file is shown in Listing 2, and the generated PurchaseOrder interface classes are shown in Listing 3.

Listing 1. Purchase Order schema

Listing 2. Purchase Order XML Document

[XML markup lost in extraction; surviving text content, in order: Sayed Hashimi, 1234 Main Street, Jacksonville, FL, 32216, GodFather, 515.99]

Listing 3. Purchase Order Java Representation

    public interface PurchaseOrderType {
        generated.ItemsType getItems();
        generated.BillToType getBillTo();
    }

    public interface PurchaseOrder
        extends javax.xml.bind.Element, generated.PurchaseOrderType {}

First, let's look at the generated Java bindings. PurchaseOrder extends PurchaseOrderType and has two accessors; one for Items and one for BillTo. We would use the bindings as:

    PurchaseOrder po = factory.createPurchaseOrder();
    BillToType billTo = po.getBillTo();
    ItemsType items = po.getItems();

Note that above you are dealing with objects whose names are suffixed with "Type". This clearly does not make programmatic sense. Have another look at the XML document. Just by looking at the XML structure, we can see that it is natural to say:

    PurchaseOrder po = factory.createPurchaseOrder();
    BillTo billTo    = po.getBillTo();
    Items items      = po.getItems();

It is not meaningful to interact with a BillToType object; we should be dealing with a BillTo object. It is important to understand why the binding compiler generated types whose names were suffixed with "Type". The reason for the "Type" suffix has to do with how we defined the XML components in the schema document. For example, in Listing 1, the PurchaseOrder element is an anonymous complex type with two elements (BillTo and Items). Note that the anonymous complex type refers (ref) to the other elements. The schema compiler generates an interface with a suffix of "Type" when it encounters a ref or an anonymous complex type defined as part of an element definition. Observing this, we modify our XSD to that shown in Listing 4.

Listing 4. XSD Without the refs

We have made three very important changes to the initial XSD.

  1. Removed the refs.
  2. Changed the internal PurchaseOrder elements to named complex types.
  3. When using a complex type within another element, we set both the name and the type of the element to the name of the complex type.

Now let's see the resulting PurchaseOrder interfaces.

    public interface PurchaseOrder
        extends javax.xml.bind.Element, generated.PurchaseOrderType {}

    public interface PurchaseOrderType {
        generated.Items getItems();
        generated.BillTo getBillTo();
    }

Note that now we get a binding that is an accurate representation of the XML. The key to getting this binding is to stick to using named complex types and the trick of giving the name and type of an element the same name as the complex type's name (see Listing 4). 

Inline Annotated Schema

Modifying the XSD is one way to control the generated output from the schema compiler, but it is not the only way. In fact, there are many things you cannot control simply by modifying the XSD file. Recall that the schema compiler uses a set of default binding rules when it generates Java bindings. For example, the schema compiler will autogenerate a Java identifier name from an XML name as a default binding rule. Often, it is necessary to override default binding rules, either due to naming requirements (e.g., package names) or naming collisions. We can override binding rules in one of two ways: by using an external binding declaration file or by writing binding declarations inline with the XML schema document. Let's see an example where we override binding rules inline with the schema.



Listing 5 shows an XML schema that contains information that may be useful to a drawing application. For clarity, an associated XML document is also shown in Listing 6.

Listing 5. An XML Schema Document for a Drawing Application

Listing 6. Sample XML Document for the Drawing Application

[XML markup lost in extraction; surviving text content: 629245]

As shown, the element Widgets defines a choice group with three possibilities: Rectangle, Square, and Circle. Each of these elements is defined as a complexType. Realize that the shape elements can appear any number of times and in any order. Let's have a look at the generated Java bindings. In particular, look at the methods exposed by Widgets:

Listing 7. Generated Bindings for the Shapes Example

    public interface WidgetsType {
        java.util.List getRectangleOrSquareOrCircle();

        public interface Circle
            extends javax.xml.bind.Element, generated.Circle {}

        public interface Rectangle
            extends javax.xml.bind.Element, generated.Rectangle {}

        public interface Square
            extends javax.xml.bind.Element, generated.Square {}
    }

    public interface Widgets
        extends javax.xml.bind.Element, generated.WidgetsType {}

To start, we have a naming problem. We did not provide a name (via binding declaration) for the choice group, and so the binding compiler had to autogenerate a name. In this case, it took the names of the subelements and combined them with "Or". Hence the accessor name for the list:

java.util.List getRectangleOrSquareOrCircle();
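
The naming rule just described is easy to mimic. The sketch below is only an illustration of the rule as stated in the text (join the subelement names with "Or" and prefix "get"). It is not the binding compiler's actual code, and the class and method names are invented for the example.

```java
import java.util.List;

class ChoiceAccessorName {
    // Joins the choice group's element names with "Or" and prefixes "get",
    // mirroring the default naming behaviour described in the text.
    static String defaultAccessorName(List<String> elementNames) {
        return "get" + String.join("Or", elementNames);
    }
}
```

Given the names Rectangle, Square, and Circle in that order, this produces getRectangleOrSquareOrCircle, and the name grows with every element added to the choice group.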

Clearly the above method name is not meaningful, and we need to resolve that. Before we do, however, let's see how we might use the bindings.

Listing 8. Java Bindings in Action

    JAXBContext jc = JAXBContext.newInstance("generated");
    Unmarshaller u = jc.createUnmarshaller();
    Widgets widgets = (Widgets) u.unmarshal(new File(xmlfile));
    List shapes = widgets.getRectangleOrSquareOrCircle();

    // iterate shapes
    Iterator shapesIt = shapes.iterator();
    while (shapesIt.hasNext()) {
        Object obj = shapesIt.next();
        if (obj instanceof Circle) {
            Circle circle = (Circle) obj;
            // process circle...
        } else if (obj instanceof Square) {
            Square square = (Square) obj;
            // process Square
        } else if (obj instanceof Rectangle) {
            Rectangle rectangle = (Rectangle) obj;
            // process Rectangle...
        } else {
            // throw exception
        }
    }

We have already identified that we have a naming problem in the accessor method. Observing the code in Listing 8, we see a fundamental OO problem. We have to get a list of shapes, figure out what the real type of each shape is, and then do something with it. It would be nice if we could avoid this. Specifically, it would be nice if we could have all of the shapes derive from a common base class so we could treat the shapes polymorphically. Listing 9 shows an XSD with annotated binding rules to resolve the issues discussed.

Listing 9. Annotated XSD

We have made two changes to our XSD from Listing 5:

  1. Added a globalBindings element that defines what the base class is for the implementation objects.
  2. Below the choice group, we have given the choice group a name (Shapes).

Listing 10 shows a portion of the generated Java bindings.

Listing 10. Improved Java Bindings

    public interface WidgetsType {
        java.util.List getShapes();

        public interface Circle
            extends javax.xml.bind.Element, generated.Circle {}

        public interface Rectangle
            extends javax.xml.bind.Element, generated.Rectangle {}

        public interface Square
            extends javax.xml.bind.Element, generated.Square {}
    }

    public class SquareImpl
        extends com.syh.Shape
        implements generated.Square, com.sun.xml.bind.JAXBObject,
                   generated.impl.runtime.UnmarshallableObject,
                   generated.impl.runtime.XMLSerializable,
                   generated.impl.runtime.ValidatableObject {}

As shown, we now get an accessor that makes sense. Also shown is one of the shape implementations. As you can see, the shape now derives from com.syh.Shape (as do the other shapes). Now instead of having to determine the underlying shape type, we can treat all shapes polymorphically. We have made great improvements with few changes to the XSD. Unfortunately, the changes do come with a caveat. Note that we have put the base class for the shapes in the globalBindings section. The globalBindings section defines overrides at the global scope. That means that all of the objects generated by the binding compiler will extend com.syh.Shape, not just the shapes. In other words, the implementation of the root element, Widgets, also derives from com.syh.Shape. Unfortunately, there is nothing we can do about that at this point. With JAXB, you either get all or none; either all of your objects have to derive from the same base class, or none. You cannot have just one class derive from a base class and not the rest. Perhaps in a later version of JAXB, Sun will resolve this issue. For now, however, we can either live with Widgets being a Shape or modify this by hand.
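
To illustrate the payoff of a common base class, here is a small sketch of client code that no longer branches on the concrete type. The Shape class below is a hypothetical stand-in for com.syh.Shape, whose contents the article does not show, and the describe() method is invented for the example. The list is deliberately untyped, since the generated accessor returns a plain java.util.List.

```java
import java.util.ArrayList;
import java.util.List;

class PolymorphicShapes {
    // Hypothetical stand-in for com.syh.Shape; describe() is an assumed method.
    static abstract class Shape {
        abstract String describe();
    }

    // Stand-ins for the generated implementation classes.
    static class Circle extends Shape {
        String describe() { return "circle"; }
    }

    static class Square extends Shape {
        String describe() { return "square"; }
    }

    static class Rectangle extends Shape {
        String describe() { return "rectangle"; }
    }

    // One cast to the base class replaces the instanceof chain of Listing 8.
    static List<String> describeAll(List<?> shapes) {
        List<String> descriptions = new ArrayList<>();
        for (Object obj : shapes) {
            Shape shape = (Shape) obj;
            descriptions.add(shape.describe());
        }
        return descriptions;
    }
}
```

Adding a new shape to the schema then requires no change to describeAll, provided its implementation class also extends the base class.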

A note in passing: the objects generated by the schema compiler are not, by default, serializable. This poses a problem if one needs to pass these objects over the network (e.g., if we use them along with our EJBs). Fortunately, we can add another entry to the global bindings section to have all of our objects implement java.io.Serializable. To generate serializable objects, add an xjc:serializable element (for example, <xjc:serializable/>) to the globalBindings section.

Finally, realize that JAXB is a specification, and vendors provide implementations of it. Sun provides a reference implementation. The two entries we made in the globalBindings section (prefixed with the xjc namespace) are vendor extensions. Any time your schema contains vendor extensions, you have to compile with the -extension switch.
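
Why serializability matters can be shown with a plain Java round trip through an object stream, which is roughly what happens when an object is passed to a remote component. The BillToImpl class below is a hypothetical, hand-written stand-in for a generated implementation class, not actual compiler output.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializableBinding {
    // Hypothetical stand-in for a generated class that implements Serializable.
    static class BillToImpl implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        BillToImpl(String name) {
            this.name = name;
        }
    }

    // Writes the object to a byte array, as would happen before a network hop.
    static byte[] toBytes(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the object back on the receiving side.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If a class in the object graph does not implement java.io.Serializable, writeObject fails with a NotSerializableException, which is the failure the global binding entry is meant to prevent.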

External Binding Declaration

We have already discussed how we can override binding rules by annotating the XML schema. In this section, we'll talk about how one can move the binding rules out of the schema document and into a separate file (called an external binding file). This technique is useful when you are dealing with large schema documents or when you simply want to separate the two files for clarity. Like the schema, the external binding file is an XML document. It contains binding declarations that override the default binding rules. Listing 11 shows the external binding file for our previous example.

Listing 11. Using An External Binding File

The binding file shown above does exactly what the previous example did inline. That is, it overrides bindings so that:

  1. All implementation objects derive from com.syh.Shape.
  2. All implementation objects are serializable.
  3. The name of the choice group is set to Shapes, to ensure that the list is named appropriately (i.e., getShapes()).

There is nothing magical about using an external binding file. The only thing that may throw you off is that you need to understand the basics of XPath, but that's it. An interesting point at this stage is that you can use inline and external bindings simultaneously. Moreover, you can split your external bindings across more than one file, if necessary. The only thing to remember is that when compiling with external bindings you need to use the -b switch for every bindings file. Finally, understand that there is no restriction on the external binding file's extension. However, by convention, .xjb is used.
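
The rule of one -b switch per bindings file can be sketched as a small helper that assembles the compiler's argument list, for example for use with ProcessBuilder. This is a sketch under assumptions: the launcher name "xjc" and all file names are hypothetical, and how the binding compiler is actually started depends on your JAXB installation.

```java
import java.util.ArrayList;
import java.util.List;

class XjcCommand {
    // Builds an argument list for the binding compiler: an optional -extension
    // switch, the schema file, and one "-b <file>" pair per external binding file.
    static List<String> build(String schemaFile, List<String> bindingFiles, boolean vendorExtensions) {
        List<String> command = new ArrayList<>();
        command.add("xjc"); // assumed launcher name
        if (vendorExtensions) {
            command.add("-extension"); // needed when xjc-prefixed extensions are used
        }
        command.add(schemaFile);
        for (String bindingFile : bindingFiles) {
            command.add("-b");
            command.add(bindingFile);
        }
        return command;
    }
}
```

The resulting list can be handed to new ProcessBuilder(command) if the launcher is on the path.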

Binding Declaration Language

We have shown several examples where we override the default bindings used by the binding compiler. In all of our examples, we intentionally left out the fact that behind all of our overrides is a language called Binding Declaration Language. We deliberately left this out because the majority of readers will want to gain an understanding of JAXB to the level where they can solve their day-to-day problems, without having to become an expert on the topic. Having said that, since we have completed the examples, it makes sense to just mention a few facts about Binding Declaration Language.

  1. The Binding Declaration Language is an XML-based language.
  2. The Binding Declaration Language defines the constructs of binding declarations.
  3. The language is extensible.

Unless you are thinking about writing an implementation of JAXB, you can usually get by just by looking at examples.

Conclusion

We have shown that there are three techniques that one can use to customize the Java bindings produced by the schema compiler. They include:

  1. Modifying the XML schema document.
  2. Using inline annotated binding declarations.
  3. Using external binding files.

A fair question at this point is: when would you use method 1 versus method 2 or 3? Realistically, you have to use a combination of methods (1 and either 2 or 3). You have to apply 1, because 1 will drastically improve the Java bindings from the start; you will get bindings that better map to the XML document. You have to understand 1 and begin to write your schema documents following the suggestions outlined in 1. Once you have run through 1, the remaining problems can be solved by either 2 or 3. These usually include naming issues that were not solved with 1.

An interesting point to keep in mind while considering or working with JAXB is that one can easily generate an XSD using a tool; given an XML document, one can generate an XSD. Unfortunately, the XSDs that are generated by tools result in horrible Java bindings using JAXB. The conclusion is that you have to write the schema by hand. The good news is that writing a schema is much simpler and faster than having to hand-code (and test) Java bindings that can handle marshalling and unmarshalling as seamlessly as JAXB can. Having said that, one might ask, "Can I just use an XML document with the schema compiler rather than a schema document?" The definitive answer is no; a schema document is required.

Sayed Hashimi is an independent consultant based in Jacksonville, Florida.