1. Fleshing out the syntax
Let’s imagine we haven’t
already seen the DSL syntax for the Scheduling DSL, and that we need to
start building such a thing from scratch. Before we begin the actual
implementation, we need to know what we want to do in our DSL:
Define named tasks
Specify what happens when a task is executed
Define when a task should execute
Describe the conditions for executing the task
Define the recurrence pattern (how often we will repeat the task)
We also need to look at those goals from the appropriate perspective—the end user’s. The client will pay for the DSL, but the end users
are the people who will end up using the DSL. There is a distinct
difference between the two. Identifying who the end user is can be a
chore, but it’s important to accurately identify who the users of the
DSL will be.
One of the major reasons
to build a DSL is to hide the complexities of the implementation with a
language that makes sense to the domain experts. If you get the wrong
idea about who is going to use the DSL, you will create something that
is harder to use. The budgeting people generally have a much fuzzier
notion about what their company is doing than the people actually doing
the work. Once you have some idea about what the end users want, you can
start the design and implementation.
I try to start with a
declarative approach. It makes it easier to abstract away all the details
that aren’t important to the users of the DSL when they are writing
scripts. That means deciding what the DSL should do. After I have
identified what I want to use
the DSL for, I can start working on the syntax. It’s usually easier to
go from an example of how you want to specify things to the syntax than
it is to go the other way.
One technique that I have
found useful is to pretend that I have a program that can perfectly
understand intent in plain English and execute it. For the Scheduling
DSL, the input for that program might look like the following:
Define a task named: "warn if website is down", starting from now, running
every 3 minutes. When website "http://example.org" is not alive, then
notify "admin@example.org" that the server is down.
This syntax should cover a
single scenario of using the Scheduling DSL, not all scenarios. The
scenario should also be very specific. Notice that I’ve included the URL
and email address in the scenario, to make it more detailed.
You should flesh out the DSL in
small stages, to make it easier to implement and to discover the right
language semantics. You should also make it clear that you’re talking
about a specific usage instance, and not the general syntax definition.
Once you have the
scenario description, you can start breaking it into lines and indenting
by action groups. This allows you to see the language syntax more
clearly:
Define a task named: "warn if website is down",
    starting from now,
    running every 3 minutes.
    When website "http://example.org" is not alive
        then notify "admin@example.org" that the server is down.
Now it looks a lot more
structured, doesn’t it? After this step, it’s a matter of turning the
natural language into something that you can build an internal DSL on.
This requires some level of expertise, but mostly it requires knowing
the syntax and what you can get away with.
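To make that last step concrete, here is a minimal sketch of how the structured scenario might map onto an internal DSL. It is written in Python purely for illustration. Every name in it (`task`, `every`, `when`, `then`, `website_is_down`, `notify`) is hypothetical and is not the actual Scheduling DSL syntax. The "starting from now" clause is left out for brevity, and the website check and the notification are stand-ins that never touch the network.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Optional


@dataclass
class Task:
    """One scheduled task, mirroring the action groups of the scenario."""
    name: str
    interval: Optional[timedelta] = None
    condition: Optional[Callable[[], bool]] = None
    action: Optional[Callable[[], None]] = None

    def every(self, interval: timedelta) -> "Task":
        # "running every 3 minutes"
        self.interval = interval
        return self

    def when(self, condition: Callable[[], bool]) -> "Task":
        # "When website ... is not alive"
        self.condition = condition
        return self

    def then(self, action: Callable[[], None]) -> "Task":
        # "then notify ..."
        self.action = action
        return self

    def run_once(self) -> bool:
        """Run the action if the condition holds; report whether it ran."""
        if self.condition is not None and self.condition():
            if self.action is not None:
                self.action()
            return True
        return False


def task(name: str) -> Task:
    # "Define a task named: ..."
    return Task(name)


# Hypothetical stand-in: pretends the site is always down.
def website_is_down(url: str) -> Callable[[], bool]:
    return lambda: True


# Hypothetical stand-in: records notifications instead of sending mail.
sent = []


def notify(address: str, message: str) -> Callable[[], None]:
    return lambda: sent.append((address, message))


warn = (
    task("warn if website is down")
    .every(timedelta(minutes=3))
    .when(website_is_down("http://example.org"))
    .then(notify("admin@example.org", "server is down"))
)
```

Each chained call corresponds to one indented line of the scenario, which is why breaking the text into action groups first makes the mapping easier to see.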
2. Choosing between imperative and declarative DSLs
There are two main styles for
building DSLs: imperative and declarative. Each of the three
DSL types can be implemented using either style, although there is a
tendency to use a more imperative approach for technical DSLs and a more
declarative approach for business DSLs.
An imperative DSL
specifies a list of steps to execute (to output text using a templating
DSL, for example). With this style, you specify what should happen.
A declarative DSL
is a specification of a goal. This specification is then executed by
the supporting infrastructure. With this style, you specify the intended
result.
The difference is really in the intention. Imperative DSLs usually specify the steps to take, and declarative DSLs specify the result you want.
SQL
and regular expressions are examples of declarative DSLs. They both
describe what you want done, but not how to do it. Build scripts are
a great example of imperative DSLs. Whatever build engine
you use (NAnt, Rake, Make), the build script lists the actions that need to
be executed in a specified order. There are also hybrid DSLs, which are a
mix of the two. They are DSLs that specify what you want done, but they
also have some explicit actions to execute.
Usually, with declarative
DSLs, there are several steps along the way to the final execution. For
example, SQL is a DSL that uses the declarative style. With SQL you can
specify what properties you want to select and according to what
criteria. You then let the database engine handle the loading of the
data.
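As a rough illustration of that contrast, the sketch below filters the same data twice: once with explicit steps, and once by handing a SQL statement to an engine. It uses Python's standard `sqlite3` module with an in-memory database. The table, the column names, and the sample orders are invented for the example.

```python
import sqlite3

# Sample data, made up for this sketch.
orders = [
    ("alice", 1200.0),
    ("bob", 300.0),
    ("carol", 2500.0),
]

# Imperative style: spell out each step of the filtering yourself.
large_imperative = []
for customer, total in orders:
    if total > 1000:
        large_imperative.append(customer)
large_imperative.sort()

# Declarative style: state the result you want and let the engine
# decide how to scan, filter, and sort.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
rows = conn.execute(
    "SELECT customer FROM orders WHERE total > 1000 ORDER BY customer"
).fetchall()
large_declarative = [row[0] for row in rows]
conn.close()
```

Both versions produce the same list of customers, but only the first one commits to a particular way of getting there.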
When you use an imperative DSL, the DSL directly dictates what will happen, as illustrated in figure 1.
When you use a
declarative DSL, the DSL specifies the desired output, and there is an
engine that takes any actions required to make it so. There isn’t
necessarily a one-to-one mapping between the output that the DSL
requests and the actions that the engine takes, as illustrated in figure 2.
You have to decide which
type of DSL you want to build. Imperative DSLs are good if you want a
simple-to-understand but open-ended solution. Declarative DSLs work well
when the problem itself is complex, but you can express the
specification for the solution in a clear manner.
Regardless of which type of DSL
you decide to build, you need to be careful not to leak implementation
details into the DSL syntax. Doing so will generally make it harder to
modify the DSL in the long run, and likely will confuse the users. DSLs
should deal with the abstract concepts, such as applying free shipping,
or suggesting registration as a preferred customer, and leave the implementation
of those concepts to the application itself.
Sometimes I build declarative
DSLs, and more often hybrid DSLs (more on them in a minute). Usually the
result of my DSLs is an object graph describing the intent of the user
that I can feed into an engine that knows how to deal with it. The DSL
portion is responsible for setting this up, and not much more.
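A minimal sketch of that split might look like the following, again in Python for illustration only. The `Order`, `Rule`, and `run_engine` names, and the 1000 and 5 percent figures, are assumptions made up for the example, not part of any real API. The point is that the rules list is plain data describing intent, and the engine alone decides how to act on it.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Order:
    total_cost: float
    customer_is_preferred: bool
    discount_percent: float = 0.0
    free_shipping: bool = False


@dataclass
class Rule:
    """A node in the object graph: a condition plus the actions it triggers."""
    condition: Callable[[Order], bool]
    actions: List[Callable[[Order], None]] = field(default_factory=list)


def add_discount(percent: float) -> Callable[[Order], None]:
    def apply(order: Order) -> None:
        order.discount_percent += percent
    return apply


def apply_free_shipping(order: Order) -> None:
    order.free_shipping = True


def run_engine(rules: List[Rule], order: Order) -> Order:
    """Apply every rule whose condition matches the order."""
    for rule in rules:
        if rule.condition(order):
            for action in rule.actions:
                action(order)
    return order


# What a DSL script would build: the graph, not the execution.
rules = [
    Rule(
        condition=lambda o: o.customer_is_preferred and o.total_cost > 1000,
        actions=[add_discount(5), apply_free_shipping],
    )
]

result = run_engine(rules, Order(total_cost=1500, customer_is_preferred=True))
```

Because the engine is separate, it could later log, reorder, or batch the actions without any change to the scripts that build the graph.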
I rarely find a use for
imperative DSLs. When I use them, it’s usually in some sort of helper
functionality: text generation, file processing, and the like. A
declarative DSL is more interesting, because it’s usually used to
express complex scenarios.
I
don’t write a lot of purely declarative DSLs. While those are quite
interesting in the abstract, getting them to work in real-world
scenarios can be hard. But mixing the styles, creating a hybrid DSL, is a powerful combination.
A hybrid DSL
is a declarative DSL that uses imperative programming approaches to
reach the final state that’s passed to the backend engine for
processing. For example, consider this rule: “All preferred customers
get 2 percent additional discount on large orders on Sunday.” That rule
is expressed in listing 1 using a hybrid of declarative and imperative styles (look at the third line):
Listing 1. A hybrid DSL, using both imperative and declarative concepts
when User.IsPreferred and Order.TotalCost > 1000:
    AddDiscountPercentage 5
    AddDiscountPercentage 2 if today is sunday
    ApplyFreeShipping
Note that this example uses
the same syntax as before, but we’re adding additional conditionals to
the mix—we’re mixing both styles. This is a silly example of the power
of hybrid DSLs, but the ability to express control flow (loops and if
constructs) and to have access to declarative concepts makes a hybrid
DSL a natural fit for specifying behavior in more complex scenarios, and it
can do so coherently.
Before we move on, listing 2 shows another approach, arguably a more declarative one, for the same problem.
Listing 2. A more declarative approach to specifying rules
applyDiscount 5.percent:
    when User.IsPreferred and Order.TotalCost > 1000
suggestPreferred:
    when not User.IsPreferred and Order.TotalCost > 1000
freeShipping:
    when Order.TotalCost > 500 and User.IsNotPreferred
    when Order.TotalCost > 1000 and User.IsPreferred
I find the example in listing 2 to be more expressive, because it explicitly breaks away from the developer mentality of ifs and branches and forces you to think about actions and triggers, which is probably a better model for this particular problem.
Nevertheless,
both examples perform the exact same operations, and are equivalent in
terms of complexity and usage. In fact, there is a one-to-one mapping
between the two.
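To see what the actions-and-triggers model of listing 2 could look like once it is loaded, here is a small Python sketch. It is only one possible representation, under my own assumptions: the `Context` class and the `actions_for` function are hypothetical, and the action names are kept as plain strings copied from the listing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Context:
    user_is_preferred: bool
    order_total: float


Trigger = Callable[[Context], bool]

# Each action maps to the triggers that can fire it, mirroring listing 2.
triggers_by_action: Dict[str, List[Trigger]] = {
    "applyDiscount 5.percent": [
        lambda c: c.user_is_preferred and c.order_total > 1000,
    ],
    "suggestPreferred": [
        lambda c: not c.user_is_preferred and c.order_total > 1000,
    ],
    "freeShipping": [
        lambda c: c.order_total > 500 and not c.user_is_preferred,
        lambda c: c.order_total > 1000 and c.user_is_preferred,
    ],
}


def actions_for(context: Context) -> List[str]:
    """Return every action with at least one trigger that matches."""
    return [
        action
        for action, triggers in triggers_by_action.items()
        if any(trigger(context) for trigger in triggers)
    ]
```

For example, `actions_for(Context(user_is_preferred=True, order_total=1500))` returns the discount and free shipping actions, in the order they are declared.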
That’s enough theory; let’s pull the concepts of a DSL apart, and see how it works.