Installing Coherence
To install
Coherence, you need to download the latest release from the Oracle
Technology Network (OTN) website. The easiest way to do so is by
following the link from the main Coherence page on OTN. At the time of
this writing, this page was located at http://www.oracle.com/technology/products/coherence/index.html,
but that might change. If it does, you can find its new location by
searching for 'Oracle Coherence' using your favorite search engine.
In order to download Coherence for evaluation, you will need to have an Oracle Technology Network (OTN) account. If you don't have one, registration is easy and completely free.
Once you are logged in, you
will be able to access the Coherence download page, where you will find
the download links for all available Coherence releases: one for Java,
one for .NET, and one for each of the supported C++ platforms.
Coherence ships as a single ZIP archive. Once you unpack it you should see the README.txt file containing the full product name and version number, and a single directory named coherence. Copy the contents of the coherence directory to a location of your choice on your hard drive. The common location on Windows is c:\coherence and on Unix/Linux /opt/coherence, but you are free to put it wherever you want.
The last thing you need to do is to configure the environment variable COHERENCE_HOME to point to the top-level Coherence directory created in the previous step, and you are done.
Coherence is a Java application, so you also need to ensure that you have the Java SDK 1.4.2 or later installed and that the JAVA_HOME environment variable is properly set to point to the Java SDK installation directory.
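Before going further, it can help to confirm that both environment variables are in place. The following is a minimal bash sketch under that assumption; check_coherence_env is a hypothetical helper name used here for illustration and is not something that ships with Coherence.

```shell
#!/usr/bin/env bash
# Sketch: report whether COHERENCE_HOME and JAVA_HOME are set and point
# to existing directories. check_coherence_env is a hypothetical helper.
check_coherence_env() {
    local status=0 name value
    for name in COHERENCE_HOME JAVA_HOME; do
        # ${!name} is bash indirect expansion: the value of the variable
        # whose name is stored in $name (empty if it is unset).
        value="${!name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value"
            status=1
        else
            echo "$name=$value"
        fi
    done
    return "$status"
}
```

Calling check_coherence_env from a terminal prints one line per variable and returns a non-zero status if either of them needs attention.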
If you are using a JVM other than
Sun's, you might need to edit the scripts used in the following
section. For example, not all JVMs support the -server option that is used while starting the Coherence nodes, so you might need to remove it.
What's in the box?
The first thing you should do
after installing Coherence is become familiar with the structure of the
Coherence installation directory.
There are four subdirectories within the Coherence home directory:
bin: This contains a number of useful
batch files for Windows and shell scripts for Unix/Linux that can be
used to start Coherence nodes or to perform various network tests
doc: This contains the
Coherence API documentation, as well as links to online copies of
Release Notes, User Guide, and Frequently Asked Questions documents
examples: This contains several basic examples of Coherence functionality
lib: This contains JAR files that implement Coherence functionality
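As a quick sanity check of the layout described above, the sketch below lists any of the four subdirectories that are missing from a given directory. The function name missing_coherence_dirs is an assumption made for this example only.

```shell
#!/usr/bin/env bash
# Sketch: print the name of each expected subdirectory that is missing
# from the directory passed as the first argument.
# missing_coherence_dirs is a hypothetical helper, not a Coherence tool.
missing_coherence_dirs() {
    local home="$1" dir
    for dir in bin doc examples lib; do
        [ -d "$home/$dir" ] || echo "$dir"
    done
}
```

Running missing_coherence_dirs "$COHERENCE_HOME" should print nothing if the archive was unpacked and copied completely.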
Shell scripts on Unix
If you are on a Unix-based system, you will need to add execute permission to the shell scripts in the bin directory by executing the following command from within that directory:
$ chmod u+x *.sh
Starting up the Coherence cluster
In order to get the Coherence
cluster up and running, you need to start one or more Coherence nodes.
The Coherence nodes can run on a single physical machine, or on many
physical machines that are on the same network. The latter will
definitely be the case for a production deployment, but for development
purposes you will likely want to limit the cluster to a single desktop
or laptop.
The easiest way to start a Coherence node is to run the cache-server.cmd batch file on Windows or the cache-server.sh shell script on Unix. The end result in either case should be similar to the following screenshot:
There is quite a bit of
information on this screen, and over time you will become familiar with
each section. For now, notice two things:
At the very top of the
screen, you can see the information about the Coherence version that you
are using, as well as the specific edition and the mode that the node
is running in. Notice that by default you are using the most powerful,
Grid Edition, in development mode.
The MasterMemberSet
section towards the bottom lists all members of the cluster and
provides some useful information about the current and the oldest member
of the cluster.
Now that we have a single Coherence node running, let's start another one by running the cache-server script in a different terminal window.
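If you would rather not keep one terminal window open per node, one alternative is to start each node in the background and send its console output to a log file. The sketch below assumes that COHERENCE_HOME is set as described earlier and that you are on Unix; start_cache_server and the log file names are hypothetical. It changes into the Coherence home directory before launching the script, on the assumption that the script may expect to be run from there.

```shell
#!/usr/bin/env bash
# Sketch: start one Coherence node in the background, capture its console
# output in a log file, and print the process ID of the started node.
# start_cache_server is a hypothetical helper name.
start_cache_server() {
    local log_file="$1"
    (
        # Run from the Coherence home directory; fail if it is not set.
        cd "${COHERENCE_HOME:?COHERENCE_HOME is not set}" || exit 1
        exec ./bin/cache-server.sh
    ) > "$log_file" 2>&1 &
    echo $!
}

# Example usage (commented out so that nothing starts by accident):
#   start_cache_server "$PWD/node1.log"
#   start_cache_server "$PWD/node2.log"
#   tail -f "$PWD/node1.log"
```

The printed process ID can later be passed to kill when you want to shut the node down.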
For the most part, the
output should be very similar to the previous screen, but if everything
has gone according to the plan, the MasterMemberSet section should reflect the fact that the second node has joined the cluster:
MasterMemberSet
(
ThisMember=Member(Id=2, ...)
OldestMember=Member(Id=1, ...)
RecycleMillis=120000
RecycleSet=MemberSet(Size=0, BitSetCount=0)
)
You should also see several
log messages on the first node's console, letting you know that another
node has joined the cluster and that some of the distributed cache
partitions were transferred to it.
If you can see these log messages on the first node, as well as two members within the ActualMemberSet on the second node, congratulations: you have a working Coherence cluster.
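If you saved a node's console output to a file, the member IDs can also be pulled out of the MasterMemberSet section with standard tools. The sketch below relies only on the ThisMember and OldestMember lines shown in the listing above; member_ids is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read node console output on standard input and print the IDs of
# the current and the oldest cluster member.
# member_ids is a hypothetical helper name.
member_ids() {
    local log this oldest
    log=$(cat)
    # Capture the digits that follow "ThisMember=Member(Id=".
    this=$(printf '%s\n' "$log" |
        sed -n 's/.*ThisMember=Member(Id=\([0-9]*\).*/\1/p' | head -n 1)
    # Capture the digits that follow "OldestMember=Member(Id=".
    oldest=$(printf '%s\n' "$log" |
        sed -n 's/.*OldestMember=Member(Id=\([0-9]*\).*/\1/p' | head -n 1)
    echo "this=$this oldest=$oldest"
}
```

When the two IDs differ, as in the listing above, the node whose output you are reading is not the oldest member, which suggests that it joined a cluster that was already running.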
Troubleshooting cluster start-up
In some cases, a Coherence
node will not be able to start or to join the cluster. In general, the
reason for this could be all kinds of networking-related issues, but in
practice a few issues are responsible for the vast majority of problems.
Multicast issues
By far the most common
issue is that multicast is disabled on the machine. By default,
Coherence uses multicast for its cluster join protocol, and it will not
be able to form the cluster unless it is enabled. You can easily check
if multicast is enabled and working properly by running the multicast-test shell script within the bin directory.
If you are unable to start
the cluster on a single machine, you can execute the following command
from your Coherence home directory:
$ . bin/multicast-test.sh -ttl 0
This will limit the
time-to-live of multicast packets to the local machine and allow you to
test multicast in isolation. If everything is working properly, you
should see a result similar to the following:
Starting test on ip=Aleks-Mac-Pro.home/192.168.1.7, group=/237.0.0.1:9000, ttl=0
Configuring multicast socket...
Starting listener...
Fri Aug 07 13:44:44 EDT 2009: Sent packet 1.
Fri Aug 07 13:44:44 EDT 2009: Received test packet 1 from self
Fri Aug 07 13:44:46 EDT 2009: Sent packet 2.
Fri Aug 07 13:44:46 EDT 2009: Received test packet 2 from self
Fri Aug 07 13:44:48 EDT 2009: Sent packet 3.
Fri Aug 07 13:44:48 EDT 2009: Received test packet 3 from self
If the output is different from the above, it is likely that multicast is not working properly or is disabled on your machine.
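Comparing the output by eye works well, but the same check can be roughly automated. The sketch below counts the "Sent packet" and "Received test packet ... from self" lines in a captured copy of the test output, using only the message formats shown above; multicast_test_ok is a hypothetical helper name, and treating one self-received packet as success is an assumption of this example.

```shell
#!/usr/bin/env bash
# Sketch: read multicast test output on standard input, print how many
# packets were sent and how many were received back from this machine,
# and succeed only if at least one packet came back.
# multicast_test_ok is a hypothetical helper name.
multicast_test_ok() {
    local output sent received
    output=$(cat)
    # grep -c exits with status 1 when the count is 0, hence "|| true".
    sent=$(printf '%s\n' "$output" | grep -c 'Sent packet [0-9]*\.' || true)
    received=$(printf '%s\n' "$output" |
        grep -c 'Received test packet [0-9]* from self' || true)
    echo "sent=$sent received_from_self=$received"
    [ "$received" -gt 0 ]
}
```

To use it, save a few seconds of test output to a file and feed that file to the function on standard input. A non-zero exit status means that no packet found its way back.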
This is frequently the
result of a firewall or VPN software running, so the first
troubleshooting step would be to disable such software and retry. If you
determine that this was indeed the cause of the problem, you have two
options. The first, and most obvious, is to turn the offending software
off while using Coherence.
However, for various
reasons that might not be an acceptable solution, in which case you will
need to change the default Coherence behavior, and tell it to use the Well-Known Addresses (WKA) feature instead of multicast for the cluster join protocol.
Doing so on a development machine is very simple: all you need to do is add the following argument to the JAVA_OPTS variable within the cache-server shell script:
-Dtangosol.coherence.wka=localhost
With that in place, you should be able to start Coherence nodes even if multicast is disabled.
Localhost and loopback address
On some systems, localhost maps to a loopback address, 127.0.0.1.
If that's the case, you will have to specify the actual IP address or host name for the tangosol.coherence.wka
configuration parameter. The host name should be preferred, as the IP
address can change as you move from network to network, or if your
machine leases an IP address from a DHCP server.
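The two helpers below sketch how this advice might be applied in a script: one recognizes loopback addresses, and the other builds the system property for whatever host name or address you decide to use. Both function names are hypothetical, and the loopback test is deliberately simple (it only recognizes 127.x.x.x and the IPv6 address ::1).

```shell
#!/usr/bin/env bash
# Sketch: succeed if the given IP address is a loopback address.
# is_loopback_address is a hypothetical helper name.
is_loopback_address() {
    case "$1" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

# Sketch: print the WKA system property for a host name or IP address.
# wka_java_opt is a hypothetical helper name.
wka_java_opt() {
    echo "-Dtangosol.coherence.wka=$1"
}

# Example usage (commented out; the host name is a placeholder):
#   JAVA_OPTS="$JAVA_OPTS $(wka_java_opt my-dev-box.example.com)"
```

On Linux, getent hosts localhost prints the address that localhost resolves to, which you can pass to is_loopback_address to decide whether a real host name is needed.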
As a side note, you can tell
whether WKA or multicast is being used for the cluster join protocol
by looking at the section above the MasterMemberSet section when the Coherence node starts.
If multicast is used, you will see something similar to the following:
Group{Address=224.3.5.1, Port=35461, TTL=4}
The actual multicast group
address and port depend on the Coherence version being used. As a matter
of fact, you can even tell the exact version and the build number from
the preceding information. In this particular case, I am using the Coherence
3.5.1 release, build 461.
The group address and port vary by version in order to
prevent accidental joins of cluster members into an existing cluster.
For example, you wouldn't want a node in the development environment
using a newer version of Coherence that you are evaluating to join the
existing production cluster, which could easily happen if the multicast
group address remained the same.
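To make the mapping concrete, here is a small sketch that decodes a group address and port back into a version and build number. The decoding rule is an assumption inferred from the single example above (the last three parts of the address give the version, and the last three digits of the port give the build); it is not a documented format, and decode_group is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: decode a default multicast group address and port into a version
# and build number.
# ASSUMPTION: the rule is inferred from one example (224.3.5.1, port 35461
# for release 3.5.1, build 461) and may not hold for other releases.
# decode_group is a hypothetical helper name.
decode_group() {
    local address="$1" port="$2"
    local prefix major minor patch
    # Split the dotted address into its four parts.
    IFS=. read -r prefix major minor patch <<< "$address"
    # Take the last three digits of the port as the build number.
    local build="${port: -3}"
    echo "version=$major.$minor.$patch build=$build"
}
```

For the values shown earlier, decode_group 224.3.5.1 35461 prints version=3.5.1 build=461.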
On the other hand, if you are using WKA, you should see output similar to the following instead:
WellKnownAddressList(Size=1,
WKA{Address=192.168.1.7, Port=8088}
)
Using the WKA
feature completely disables multicast in a Coherence cluster, and is
recommended for most production deployments, primarily due to the fact
that many production environments prohibit multicast traffic altogether,
and that some network switches do not route multicast traffic properly.
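If you keep node output in log files, a short script can report which join mechanism a node used by looking for the two markers shown above. This is a minimal sketch; join_mode is a hypothetical helper name, and it assumes that the startup output contains one of the two marker strings exactly as printed in the earlier listings.

```shell
#!/usr/bin/env bash
# Sketch: read node startup output on standard input and print "wka",
# "multicast", or "unknown" depending on which marker is present.
# join_mode is a hypothetical helper name.
join_mode() {
    local log
    log=$(cat)
    # -F treats the pattern as a fixed string, so ( and { are literal.
    if printf '%s\n' "$log" | grep -qF 'WellKnownAddressList('; then
        echo "wka"
    elif printf '%s\n' "$log" | grep -qF 'Group{Address='; then
        echo "multicast"
    else
        echo "unknown"
    fi
}
```

Feed it a saved copy of a node's startup output on standard input; a result of unknown usually just means that the node had not printed that section yet.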
Binding issues
Another issue that sometimes
comes up is that one of the ports that Coherence attempts to bind to is
already in use and you see a bind exception when attempting to start the
node.
By default, Coherence starts the first node on port 8088,
and increments the port number by one for each subsequent node on the same
machine. If for some reason that doesn't work for you, you need to
identify a range of available ports for as many nodes as you are
planning to start (both UDP and TCP ports with the same numbers must be
available), and tell Coherence which port to use for the first node by
specifying the tangosol.coherence.localport system property. For example, if you want Coherence to use port 9100 for the first node, you will need to add the following argument to the JAVA_OPTS variable in the cache-server shell script:
-Dtangosol.coherence.localport=9100
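When choosing a starting port, it helps to list the ports that a given number of nodes would occupy. The sketch below does that arithmetic and adds a rough probe for ports that already have a TCP listener. Both function names are hypothetical. The probe relies on the bash-specific /dev/tcp redirection, only sees TCP listeners reachable through 127.0.0.1, and says nothing about UDP, so treat a "free" result as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Sketch: print the ports that COUNT nodes on one machine would use when
# the first node starts on port FIRST and each later node uses the next one.
# coherence_ports is a hypothetical helper name.
coherence_ports() {
    local first="$1" count="$2" i
    for (( i = 0; i < count; i++ )); do
        echo $(( first + i ))
    done
}

# Sketch: succeed if something accepts TCP connections on the given local
# port. Bash-only (/dev/tcp); does not check UDP.
# tcp_port_in_use is a hypothetical helper name.
tcp_port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Example usage (commented out): check the ports for three nodes from 9100.
#   for port in $(coherence_ports 9100 3); do
#       if tcp_port_in_use "$port"; then echo "$port: in use (TCP)"
#       else echo "$port: no TCP listener found"; fi
#   done
```

Tools such as netstat or ss give a fuller picture, including UDP sockets, if they are available on your system.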