Oracle Coherence 3.5 : Implementing object serialization (part 3) - PortableObject or PofSerializer, Collection serialization with POF

5/11/2013 2:25:42 AM

4. PortableObject or PofSerializer?

One reason to choose a custom PofSerializer implementation is the one we discussed previously: your objects have complex creational logic and you need complete control over their instantiation.

A second, very obvious reason is that you may not be able to add the necessary POF serialization code to a class. While it is usually possible to add such code to your own classes, keeping the code in a separate PofSerializer implementation means that you can also serialize other people's non-POF-compliant classes using POF, including classes that come with the JDK, various frameworks, application servers, and even packaged applications.

For example, we have used the java.util.Currency class within the Money implementation. In order to make Money as a whole portable, we need to ensure that Currency is portable as well. The only way to do that is by creating an external serializer:

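import com.tangosol.io.pof.PofReader;
import com.tangosol.io.pof.PofSerializer;
import com.tangosol.io.pof.PofWriter;

import java.io.IOException;
import java.util.Currency;

public class CurrencySerializer implements PofSerializer {

    public void serialize(PofWriter writer, Object obj) 
                                          throws IOException {
        Currency currency = (Currency) obj;

        // the ISO 4217 code is enough to identify a currency
        writer.writeString(0, currency.getCurrencyCode());
        writer.writeRemainder(null);
    }

    public Object deserialize(PofReader reader) 
                                          throws IOException {
        String currencyCode = reader.readString(0);
        reader.readRemainder();

        // Currency has no public constructor, so use its factory method
        return Currency.getInstance(currencyCode);
    }
}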
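The currency code is all the serializer needs to write because java.util.Currency has no public constructor and hands out one shared instance per currency through Currency.getInstance. The following plain-Java sketch leaves POF out entirely and uses a hypothetical class name, CurrencyCodeRoundTrip, to illustrate the round trip the serializer relies on:

```java
import java.util.Currency;

// Hypothetical helper, not part of Coherence: it only illustrates the
// code-based round trip that CurrencySerializer depends on.
class CurrencyCodeRoundTrip {

    // What gets written: the ISO 4217 code, for example "EUR".
    static String toCode(Currency currency) {
        return currency.getCurrencyCode();
    }

    // What happens on read: the instance is looked up by its code.
    static Currency fromCode(String code) {
        return Currency.getInstance(code);
    }

    public static void main(String[] args) {
        Currency original = Currency.getInstance("EUR");
        Currency restored = fromCode(toCode(original));

        // The JDK keeps a single instance per currency, so the restored
        // object is the very same instance as the original.
        System.out.println(restored == original);
    }
}
```

Because the lookup goes through a factory method rather than a constructor, this kind of class cannot be rebuilt by a serializer that expects a public default constructor.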

Another good reason to choose an external serializer is when you can write a single serializer that is able to handle many different types. For example, let's take a look at serialization of enum values.

In order to serialize enums in a platform-independent way, the best option is to write the name of the enum value into a POF stream. On the surface, this seems easy enough to do directly within the serialization code:

public enum TransactionType {
    DEPOSIT,
    WITHDRAWAL
}

public class Transaction implements PortableObject {

    // data members
    private TransactionType m_type;

    public Transaction() {
    }

    public void readExternal(PofReader pofReader)
                                throws IOException {
        m_type = Enum.valueOf(TransactionType.class,
                              pofReader.readString(0));
    }

    public void writeExternal(PofWriter pofWriter)
                                throws IOException {
        pofWriter.writeString(0, m_type.name());
    }
}

Unfortunately, this approach has two problems.

First, you have to repeat the somewhat cumbersome deserialization code for every enum field in your application. Second, if you need to serialize a collection that contains enum values, you have no place to inject that custom deserialization logic, because the collection's elements are read by POF itself rather than by your code.

Because of this, it is a much better solution to implement a custom PofSerializer that can be used to serialize all enum types:

public class EnumPofSerializer implements PofSerializer {

    public void serialize(PofWriter writer, Object o) 
                                         throws IOException {
        if (!o.getClass().isEnum()) {
            throw new IOException(
                    "EnumPofSerializer can only be used to " +
                    "serialize enum types.");        
        }

        writer.writeString(0, ((Enum) o).name());
        writer.writeRemainder(null);
    }

    public Object deserialize(PofReader reader) 
                                         throws IOException {

        PofContext pofContext = reader.getPofContext();
        Class enumType = pofContext.getClass(reader.getUserTypeId());
        if (!enumType.isEnum()) {
            throw new IOException(
                    "EnumPofSerializer can only be used to " +
                    "deserialize enum types.");        
        }

        Enum enumValue = Enum.valueOf(enumType, reader.readString(0));
        reader.readRemainder();
    
        return enumValue;
    }
}
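Stripped of the POF plumbing, the serializer above boils down to a name-based round trip that works for any enum type. Here is a minimal sketch in plain Java; every name in it (EnumNameCodec, AccountStatus, Sign) is hypothetical and chosen only for illustration:

```java
// Plain Java, no Coherence classes; all names here are hypothetical.
class EnumNameCodec {

    // Write side: the constant's name is the only thing that is stored.
    static String encode(Enum<?> value) {
        return value.name();
    }

    // Read side: the enum class plus the name identify the constant.
    static <E extends Enum<E>> E decode(Class<E> type, String name) {
        return Enum.valueOf(type, name);
    }

    enum TransactionType { DEPOSIT, WITHDRAWAL }

    enum AccountStatus { OPEN, CLOSED }

    // An enum whose constants have their own class bodies.
    enum Sign {
        PLUS  { int apply(int x) { return x; } },
        MINUS { int apply(int x) { return -x; } };

        abstract int apply(int x);
    }

    public static void main(String[] args) {
        // The same two methods handle unrelated enum types.
        System.out.println(decode(TransactionType.class, encode(TransactionType.WITHDRAWAL)));
        System.out.println(decode(AccountStatus.class, encode(AccountStatus.CLOSED)));

        // For constants with bodies, getClass() is an anonymous subclass,
        // while getDeclaringClass() is the enum type itself.
        System.out.println(Sign.PLUS.getClass().isEnum());
        System.out.println(Sign.PLUS.getDeclaringClass().isEnum());
    }
}
```

One caveat is worth checking in your own code: for an enum constant declared with its own class body, getClass().isEnum() returns false, so a guard based on getClass() would reject such a constant, whereas getDeclaringClass() always returns the enum type.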

Now all we have to do is register our enum types within the POF configuration, specifying EnumPofSerializer as their serializer, and use the readObject and writeObject methods to serialize them:

public void readExternal(PofReader pofReader)
                            throws IOException {
    m_type = (TransactionType) pofReader.readObject(0);
}

public void writeExternal(PofWriter pofWriter)
                            throws IOException {
    pofWriter.writeObject(0, m_type);
}
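The registration step mentioned above might look roughly like the following fragment of a POF configuration file. The package name com.example and the type ID 1001 are assumptions made for illustration; use whatever packages and free type IDs your application actually has:

```xml
<user-type-list>
  <!-- hypothetical type ID and package names -->
  <user-type>
    <type-id>1001</type-id>
    <class-name>com.example.TransactionType</class-name>
    <serializer>
      <class-name>com.example.EnumPofSerializer</class-name>
    </serializer>
  </user-type>
</user-type-list>
```

Each additional enum type gets its own user-type entry with a distinct type ID and the same serializer class.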

This greatly simplifies enum serialization and ensures that they are serialized consistently throughout the application. Better yet, it will allow for a completely transparent serialization of enum values within various collections.

The previous examples demonstrate some great uses for external serializers, but we still haven't answered the question we started this section with—which approach to use when serializing our domain objects.

Implementing the PortableObject interface is quick and easy, and it doesn't require you to configure a serializer for each class. However, it forces you to define a public default constructor, which can be used by anyone to create instances of your domain objects that are in an invalid state.

An external serializer gives you full control over object creation, but is more cumbersome to write and configure. It also might force you to break the encapsulation, as we did earlier, by providing a getAccountIds method on the Customer class in order to allow serialization of account identifiers.

The last problem can be easily solved by implementing the external serializer as a static inner class of the class it serializes. That way, it will be able to access its private members directly, which is really what you want to do within the serializer.

In addition to that, the Coherence Tools project provides AbstractPofSerializer, an abstract base class that makes the implementation of external serializers significantly simpler by removing the need to read and write the remainder. We will discuss the implementation of this class shortly, but for now let's see what the customer serializer would look like if implemented as a static inner class that extends AbstractPofSerializer:

public class Customer 
    implements Entity<Long> {
    ...

    public static class Serializer
            extends AbstractPofSerializer<Customer> {

        protected void serializeAttributes(Customer c, PofWriter writer)
                throws IOException {
            writer.writeLong      (0, c.m_id);
            writer.writeString    (1, c.m_name);
            writer.writeString    (2, c.m_email);
            writer.writeObject    (3, c.m_address);
            writer.writeCollection(4, c.m_accountIds);
        }

        protected Customer createInstance(PofReader reader)
                throws IOException {
            return new Customer(
                    reader.readLong(0),
                    reader.readString(1),
                    reader.readString(2),
                    (Address) reader.readObject(3),
                    reader.readCollection(4, new ArrayList<Long>()));
        }
    }
}

I believe you will agree that implementing the external serializer this way is almost as simple as implementing the PortableObject interface directly (we still need to configure the serializer explicitly), but without its downsides.
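To make the idea of a base class that handles the remainder more concrete, here is a rough sketch of the template-method pattern it suggests. This is not the Coherence Tools source: SketchWriter and SketchReader are hypothetical stand-ins for PofWriter and PofReader so that the example compiles without Coherence, and Tag is an invented domain class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for PofWriter and PofReader.
interface SketchWriter {
    void writeString(int index, String value);
    void writeRemainder();
}

interface SketchReader {
    String readString(int index);
    void readRemainder();
}

// Template-method base class: subclasses deal with attributes only,
// and the remainder is handled once, here.
abstract class AbstractSketchSerializer<T> {
    public final void serialize(SketchWriter writer, T obj) {
        serializeAttributes(obj, writer);
        writer.writeRemainder();
    }

    public final T deserialize(SketchReader reader) {
        T obj = createInstance(reader);
        reader.readRemainder();
        return obj;
    }

    protected abstract void serializeAttributes(T obj, SketchWriter writer);

    protected abstract T createInstance(SketchReader reader);
}

// In-memory "stream" that also records the order of calls.
class RecordingStream implements SketchWriter, SketchReader {
    final Map<Integer, String> values = new HashMap<Integer, String>();
    final List<String> log = new ArrayList<String>();

    public void writeString(int index, String value) {
        values.put(index, value);
        log.add("write " + index);
    }

    public void writeRemainder() { log.add("writeRemainder"); }

    public String readString(int index) {
        log.add("read " + index);
        return values.get(index);
    }

    public void readRemainder() { log.add("readRemainder"); }
}

// Invented domain class with no default constructor and no getters
// added just for serialization.
class Tag {
    private final String m_name;

    Tag(String name) { m_name = name; }

    String describe() { return "Tag(" + m_name + ")"; }

    // The nested serializer reads the private field directly.
    static class Serializer extends AbstractSketchSerializer<Tag> {
        protected void serializeAttributes(Tag t, SketchWriter writer) {
            writer.writeString(0, t.m_name);
        }

        protected Tag createInstance(SketchReader reader) {
            return new Tag(reader.readString(0));
        }
    }
}
```

Serializing a Tag through Tag.Serializer and reading it back from the same RecordingStream yields an equivalent Tag, and the recorded log shows the remainder calls being made by the base class rather than by the subclass.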

Because of this, my recommendation is to implement external serializers in the manner presented for all domain classes, and to implement the PortableObject interface directly only within the classes that are closely related to Coherence infrastructure, such as entry processors, filters, and value extractors.

5. Collection serialization with POF

While implementation of the POF serialization code is straightforward, one subject that deserves a more detailed discussion is collection serialization.

POF does not encode collection type into the POF stream. If it did, it wouldn't be portable, as collection types are platform dependent. For example, if it encoded the Java class java.util.LinkedList into the stream, a .NET client would have no way to deserialize the collection, because that Java class does not exist on the .NET platform.

Instead, POF leaves it to the serialization code to decide which collection type to return by providing a collection template to the PofReader.readCollection method:

List myList = (List) pofReader.readCollection(0, new LinkedList());

The situation with maps is similar:

Map myMap = pofReader.readMap(1, new TreeMap());

You can specify null as the template in both cases, but you should never do that if you care about the type of the returned object and not just about the fact that it implements the Collection or Map interface. If you pass null, Coherence will return an instance of a default type, which is probably not what you want.

For example, you might've noticed that I specified a new ArrayList as a collection template whenever I was reading account ids from the POF stream in the Customer serialization examples. The reason for that is that I need the collection of account ids to be mutable after deserialization (so the customer can open a new account). If I didn't provide a template, account ids would be returned as ImmutableArrayList, one of the internal List implementations within Coherence.

To make things even worse, there is no guarantee that the default implementation will even remain the same across Coherence versions, and you might get surprised if you move from one supported platform to another (for example, the .NET Coherence client returns an object array as a default implementation). The bottom line is that you should always specify a template when reading collections and maps.
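The difference a template makes can be illustrated without Coherence. In the sketch below, readCollection is a hypothetical stand-in for the real reader method, the decoded list plays the role of elements already read from the stream, and the read-only default is an assumption that mirrors the immutable list described above:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class CollectionTemplateSketch {

    // Hypothetical stand-in for a template-based read.
    static <T> Collection<T> readCollection(List<T> decoded,
                                            Collection<T> template) {
        if (template == null) {
            // Assumption for illustration: the default type is read-only.
            return Collections.unmodifiableList(new ArrayList<T>(decoded));
        }
        // With a template, the caller decides the concrete type.
        template.addAll(decoded);
        return template;
    }

    public static void main(String[] args) {
        List<Long> decoded = new ArrayList<Long>();
        decoded.add(1L);
        decoded.add(2L);

        Collection<Long> accountIds =
                readCollection(decoded, new ArrayList<Long>());
        accountIds.add(3L);   // fine: the template was a mutable ArrayList
        System.out.println(accountIds.size());

        Collection<Long> readOnly = readCollection(decoded, null);
        try {
            readOnly.add(3L);
        } catch (UnsupportedOperationException e) {
            System.out.println("default collection rejected the change");
        }
    }
}
```

The caller that passes a template gets back exactly the instance it supplied, so both the concrete type and its mutability are under its control.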

The situation with object arrays is similar: if you want the serializer to return an array of a specific type, you should provide a template for it:

Product[] myProductArray = (Product[]) pofReader.readObjectArray(2, new Product[0]);

If you know the number of elements ahead of time you should size the array accordingly, but more likely than not you will not have that information. In that case, you can simply create an empty array as in the previous example, and let the POF serializer resize it for you.
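The JDK's own Collection.toArray method follows a comparable contract, which makes it a convenient way to see the effect of an array template in isolation. The sketch below uses only JDK classes; Product here is a hypothetical placeholder class, and nothing in it calls Coherence:

```java
import java.util.ArrayList;
import java.util.List;

class ArrayTemplateSketch {

    // Hypothetical placeholder for a domain class.
    static class Product {
        final String name;

        Product(String name) { this.name = name; }
    }

    // The empty template only supplies the component type; because it
    // is too small, a new Product[] of the right length is allocated.
    static Product[] typed(List<Product> decoded) {
        return decoded.toArray(new Product[0]);
    }

    // Without a template the result is a plain Object[].
    static Object[] untyped(List<Product> decoded) {
        return decoded.toArray();
    }

    public static void main(String[] args) {
        List<Product> decoded = new ArrayList<Product>();
        decoded.add(new Product("laptop"));
        decoded.add(new Product("phone"));

        System.out.println(typed(decoded).getClass().getSimpleName());
        System.out.println(untyped(decoded).getClass().getSimpleName());
    }
}
```

An untyped result cannot be cast to Product[] at run time, which is why supplying the template matters whenever the calling code expects a specific array type.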

The only exception to this rule is if you have written the array into the stream using the PofWriter.writeObjectArray overload that writes out a uniform array, in which case the serializer will have enough information to create an array of a correct type even if you don't specify a template for it.

This brings us to the discussion about uniform versus non-uniform collection and array write methods.

If you browse the API documentation for PofWriter, you will notice that there are multiple overloaded versions of the writeCollection, writeMap, and writeObjectArray methods.

The basic ones simply take the attribute index and value as arguments and they will write the value out using a non-uniform format, which means that the element type will be encoded for each element in a collection. Obviously, this is wasteful if all elements are of the same type, so PofWriter provides methods that allow you to specify element type as well, or in the case of the writeMap method, both the key type and the value type of map entries.

If your collection, array, or map is uniform, and most are, you should always use the uniform versions of the write methods and specify the element type explicitly. This can significantly reduce the size of the serialized data by allowing the POF serializer to write type information only once instead of for each element.
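The size difference is easy to see with a toy encoding. The format below is invented purely for illustration and is not the actual POF wire format: it uses a one-byte type tag, a four-byte element count, and eight bytes per long value.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Toy format for illustration only; this is NOT the POF wire format.
class UniformEncodingSketch {

    static final int TYPE_LONG = 1;   // hypothetical one-byte type tag

    // Non-uniform: the type tag is repeated in front of every element.
    static byte[] writeNonUniform(List<Long> values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        putInt(out, values.size());
        for (long value : values) {
            out.write(TYPE_LONG);
            putLong(out, value);
        }
        return out.toByteArray();
    }

    // Uniform: the type tag is written once for the whole collection.
    static byte[] writeUniform(List<Long> values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(TYPE_LONG);
        putInt(out, values.size());
        for (long value : values) {
            putLong(out, value);
        }
        return out.toByteArray();
    }

    private static void putInt(ByteArrayOutputStream out, int value) {
        for (int shift = 24; shift >= 0; shift -= 8) {
            out.write(value >>> shift);
        }
    }

    private static void putLong(ByteArrayOutputStream out, long value) {
        for (int shift = 56; shift >= 0; shift -= 8) {
            out.write((int) (value >>> shift));
        }
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<Long>();
        for (long i = 0; i < 1000; i++) {
            ids.add(i);
        }
        System.out.println("non-uniform: " + writeNonUniform(ids).length);
        System.out.println("uniform:     " + writeUniform(ids).length);
    }
}
```

In this toy format the uniform form saves one byte per element beyond the first. With PofWriter itself, the uniform form corresponds to the overloads that additionally take the element class, for example a call along the lines of writeCollection(4, accountIds, Long.class); check the API documentation for the exact overloads in your Coherence version.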
