Implementing serialization for Evolvable objects
The Evolvable interface simply defines the
information that class instances need to be able to provide in order for the
class to support schema evolution. The rest of the work is performed by
a serializer that knows how to use that information to support
serialization across multiple versions of a class.
The easiest way to add schema evolution support to
your application is to use an out-of-the-box serializer that implements
the necessary logic. One such serializer is the PortableObjectSerializer we discussed earlier, and it makes schema evolution a breeze. You simply implement both PortableObject and Evolvable interfaces within your class (or even simpler, a convenience EvolvablePortableObject interface), and the serializer takes care of the rest.
However, if you follow my earlier advice and
implement external serializers for your domain objects, you need to
handle object evolution yourself.
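Coherence exposes this contract as the Evolvable interface (in the com.tangosol.io package). The sketch below is a self-contained stand-in that mirrors the methods our serializers rely on; note that byte[] is used here in place of Coherence's Binary type, and the EvolvableCustomer class with its version numbers is invented purely for illustration:

```java
import java.util.Arrays;

// Stand-in for Coherence's com.tangosol.io.Evolvable interface.
// (The real interface uses com.tangosol.util.Binary for the remainder;
// byte[] substitutes for it so this sketch is self-contained.)
interface Evolvable {
    int getImplVersion();             // schema version this class implements
    int getDataVersion();             // version read from the POF stream
    void setDataVersion(int version);
    byte[] getFutureData();           // unknown attributes from a newer version
    void setFutureData(byte[] data);
}

// A minimal evolvable class: version bookkeeping only, no real attributes.
class EvolvableCustomer implements Evolvable {
    private int dataVersion;
    private byte[] futureData;

    public int getImplVersion()            { return 2; }  // this code knows schema v2
    public int getDataVersion()            { return dataVersion; }
    public void setDataVersion(int v)      { dataVersion = v; }
    public byte[] getFutureData()          { return futureData; }
    public void setFutureData(byte[] data) { futureData = data; }
}

public class EvolvableDemo {
    public static void main(String[] args) {
        EvolvableCustomer c = new EvolvableCustomer();

        // Simulate deserializing a v3 object with unknown trailing attributes.
        c.setDataVersion(3);
        c.setFutureData(new byte[] {42, 7});

        // A serializer would write max(implVersion, dataVersion) = 3
        // and carry the unknown attributes along unchanged.
        System.out.println(Math.max(c.getImplVersion(), c.getDataVersion()));
        System.out.println(Arrays.toString(c.getFutureData()));
    }
}
```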
The algorithm to implement is fairly simple. When deserializing an object, we need to:

1. Read the data version from the POF stream and set the dataVersion attribute.
2. Read object attributes as usual.
3. Read the remaining attributes, if any, from the POF stream and set the futureData attribute.

The last step is only meaningful when we are deserializing a newer object version. In all other cases futureData will be null.
When serializing an object, we need to do the exact opposite for steps 2 and 3, but the first step is slightly different:

1. Set the version of the POF stream to the greater of the implementation version and the data version.
2. Write object attributes as usual.
3. Write the future data into the POF stream.
The reason we write the greater of the implementation and data versions
in the first step is that we always want to have the latest possible
version in the POF stream. If we deserialized a newer version of an
object, we need to ensure that its version is written into the POF
stream when we serialize the object again, as we will be including its
original data in the POF stream as well. On the other hand, if we
deserialized an older version, we should write the new version, containing
the new attributes, when serializing the object again.
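The version-resolution rule boils down to a single Math.max call. A minimal sketch, where the versionToWrite helper and the concrete version numbers are invented for illustration:

```java
public class VersionResolution {
    // The version written to the POF stream: the greater of the version this
    // class implements and the version that was read from the stream.
    static int versionToWrite(int implVersion, int dataVersion) {
        return Math.max(implVersion, dataVersion);
    }

    public static void main(String[] args) {
        // A node running class version 2 re-serializes a v3 object it cannot
        // fully understand: the stream keeps advertising v3.
        System.out.println(versionToWrite(2, 3)); // 3

        // A node running class version 2 re-serializes a v1 object: the data
        // is upgraded to v2 on the way back out.
        System.out.println(versionToWrite(2, 1)); // 2

        // Same version on both sides: nothing changes.
        System.out.println(versionToWrite(2, 2)); // 2
    }
}
```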
This is actually the key element of the schema
evolution strategy in Coherence: it allows us to upgrade the cluster
node by node, upgrading the data stored within the cluster along the way.
Imagine that you have a ten-node Coherence cluster
that you need to upgrade. You can shut a single node down, upgrade it
with new JAR files and restart it. Because the data is partitioned
across the cluster and there are backup copies available, the loss of a
single node is irrelevant—the cluster will repartition itself, backup
copies of the data will be promoted to primary copies, and the
application or applications using the cluster will be oblivious to the
loss of a node.
When an upgraded node rejoins the cluster, it will
become responsible for some of the data partitions. As the data it
manages is deserialized, instances of new classes will be created and
the new attributes will be either calculated or defaulted to their
initial values. When those instances are subsequently serialized and
stored in the cluster, their version is set to the latest implementation
version and any node or client application using one of the older
versions of the class will use the futureData attribute to preserve new attributes.
As you go through the same process with the remaining
nodes, more and more data will be incrementally upgraded to the latest
class version, until eventually all the data in the cluster uses the
current version.
What is important to note is that client applications
do not need to be upgraded to the new classes at the same time. They
can continue to use the older versions of the classes; they will simply
store the future data as a binary blob on reads and include it in the POF
stream on writes. As a matter of fact, you can have ten different
applications, each using a different version of the data classes, and
they will all continue to work just fine, as long as all the classes are
evolvable.
Now that we have the theory covered, let's see how we would actually implement a serializer for our Customer class to support evolution.
public class CustomerSerializer implements PofSerializer {

    public void serialize(PofWriter writer, Object o)
            throws IOException {
        Customer c = (Customer) o;

        int dataVersion = Math.max(c.getImplVersion(),
                                   c.getDataVersion());
        writer.setVersionId(dataVersion);

        writer.writeLong(0, c.getId());
        writer.writeString(1, c.getName());
        writer.writeString(2, c.getEmail());
        writer.writeObject(3, c.getAddress());
        writer.writeCollection(4, c.getAccountIds());

        writer.writeRemainder(c.getFutureData());
    }

    public Object deserialize(PofReader reader)
            throws IOException {
        Long id = reader.readLong(0);
        String name = reader.readString(1);
        String email = reader.readString(2);
        Address address = (Address) reader.readObject(3);
        Collection<Long> accountIds =
                reader.readCollection(4, new ArrayList<Long>());

        Customer c = new Customer(id, name, email, address, accountIds);
        c.setDataVersion(reader.getVersionId());
        c.setFutureData(reader.readRemainder());
        return c;
    }
}
The version-related code is simple, but it is immediately obvious that it has nothing to do with the Customer class per se, as it only depends on the methods defined by the Evolvable interface. As such, it simply begs for refactoring into an abstract base class that we can reuse for all of our serializers:
public abstract class AbstractPofSerializer<T>
        implements PofSerializer {

    protected abstract void serializeAttributes(T obj, PofWriter writer)
            throws IOException;

    protected abstract void deserializeAttributes(T obj, PofReader reader)
            throws IOException;

    protected abstract T createInstance(PofReader reader)
            throws IOException;

    @SuppressWarnings("unchecked")
    public void serialize(PofWriter writer, Object obj)
            throws IOException {
        T instance = (T) obj;

        boolean isEvolvable = obj instanceof Evolvable;
        Evolvable evolvable = null;
        if (isEvolvable) {
            evolvable = (Evolvable) obj;
            int dataVersion = Math.max(evolvable.getImplVersion(),
                                       evolvable.getDataVersion());
            writer.setVersionId(dataVersion);
        }

        serializeAttributes(instance, writer);

        Binary futureData = isEvolvable
                            ? evolvable.getFutureData()
                            : null;
        writer.writeRemainder(futureData);
    }

    public Object deserialize(PofReader reader)
            throws IOException {
        T instance = createInstance(reader);

        boolean isEvolvable = instance instanceof Evolvable;
        Evolvable evolvable = null;
        if (isEvolvable) {
            evolvable = (Evolvable) instance;
            evolvable.setDataVersion(reader.getVersionId());
        }

        deserializeAttributes(instance, reader);

        Binary futureData = reader.readRemainder();
        if (isEvolvable) {
            evolvable.setFutureData(futureData);
        }
        return instance;
    }
}
The only thing worth pointing out is that both the createInstance and deserializeAttributes methods read attributes from the POF stream. The difference between the two is that createInstance
should read only the attributes that are necessary for instance
creation, such as constructor or factory method arguments. All other
object attributes should be read from the stream and set within the deserializeAttributes method.
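To make this split concrete without a Coherence dependency, the same pattern can be sketched over plain java.io streams: createInstance reads only the constructor argument, while deserializeAttributes reads the rest and sets it on the instance. The AbstractSerializer base class, the simplified Customer, and the field layout below are stand-ins invented for illustration, not Coherence APIs:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative analogue of AbstractPofSerializer's two-phase deserialization.
abstract class AbstractSerializer<T> {
    protected abstract T createInstance(DataInput in) throws IOException;
    protected abstract void deserializeAttributes(T obj, DataInput in)
            throws IOException;

    public T deserialize(DataInput in) throws IOException {
        T instance = createInstance(in);      // constructor arguments only
        deserializeAttributes(instance, in);  // everything else
        return instance;
    }
}

// Simplified stand-in: one constructor argument, one settable attribute.
class Customer {
    final long id;   // required by the constructor
    String email;    // settable after construction

    Customer(long id) { this.id = id; }
}

class CustomerSerializer extends AbstractSerializer<Customer> {
    protected Customer createInstance(DataInput in) throws IOException {
        // Read only what the constructor needs.
        return new Customer(in.readLong());
    }

    protected void deserializeAttributes(Customer c, DataInput in)
            throws IOException {
        // Read the remaining attributes and set them on the instance.
        c.email = in.readUTF();
    }
}

public class SplitDemo {
    public static void main(String[] args) throws IOException {
        // Write a customer record, then read it back through the serializer.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeLong(42L);
        out.writeUTF("jane@example.com");

        Customer c = new CustomerSerializer().deserialize(
                new DataInputStream(
                        new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(c.id + " " + c.email); // 42 jane@example.com
    }
}
```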