The drive toward DSLs: Choosing between imperative and declarative DSLs

6/8/2012 11:32:36 AM

1. Fleshing out the syntax

Let’s imagine we haven’t already seen the DSL syntax for the Scheduling DSL, and that we need to start building such a thing from scratch. Before we begin the actual implementation, we need to know what we want to do in our DSL:

Define named tasks
Specify what happens when a task is executed
Define when a task should execute
Describe the conditions for executing the task
Define the recurrence pattern (how often we will repeat the task)

We also need to look at those goals from the appropriate perspective—the end user’s. The client will pay for the DSL, but the end users are the people who will end up using the DSL. There is a distinct difference between the two. Identifying who the end user is can be a chore, but it’s important to accurately identify who the users of the DSL will be.

One of the major reasons to build a DSL is to hide the complexities of the implementation with a language that makes sense to the domain experts. If you get the wrong idea about who is going to use the DSL, you will create something that is harder to use. The budgeting people generally have a much fuzzier notion about what their company is doing than the people actually doing the work. Once you have some idea about what the end users want, you can start the design and implementation.

I try to start with a declarative approach. It makes it easier to abstract away all the details that aren’t important to the users of the DSL when they are writing scripts. That means deciding what the DSL should do. After I have identified what I want to use the DSL for, I can start working on the syntax. It’s usually easier to go from an example of how you want to specify things to the syntax than it is to go the other way.

One technique that I have found useful is to pretend that I have a program that can perfectly understand intent in plain English and execute it. For the Scheduling DSL, the input for that program might look like the following:

Define a task named: "warn if website is down", starting from now, running
     every 3 minutes. When website "http://example.org" is not alive, then
     notify "admin@example.org" that the server is down.

This syntax should cover a single scenario of using the Scheduling DSL, not all scenarios. The scenario should also be very specific. Notice that I’ve included the URL and email address in the scenario, to make it more detailed.

You should flesh out the DSL in small stages, to make it easier to implement and to discover the right language semantics. You should also make it clear that you’re talking about a specific usage instance, and not the general syntax definition.

Once you have the scenario description, you can start breaking it into lines and indenting by action groups. This allows you to see the language syntax more clearly:

Define a task named: "warn if website is down",
        starting from now,
        running every 3 minutes.
        When website "http://example.org" is not alive
        then notify "admin@example.org" that the server is down.

Now it looks a lot more structured, doesn’t it? After this step, it’s a matter of turning the natural language into something that you can build an internal DSL on. This requires some level of expertise, but mostly it requires knowing the syntax and what you can get away with.
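
To make that step concrete, here is a minimal sketch of how the indented scenario might map onto an internal DSL. It uses Python rather than the syntax of the actual Scheduling DSL, and every name in it (Task, starting_from, every, when, then, website_is_alive, notify) is a hypothetical stand-in chosen for illustration, not part of any real library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class Task:
    """One scheduled task: name, start, recurrence, condition, and action."""
    name: str
    starting: Optional[datetime] = None
    interval: Optional[timedelta] = None
    condition: Callable[[], bool] = lambda: True
    action: Callable[[], None] = lambda: None

    # Each method returns self so the calls chain like the indented scenario.
    def starting_from(self, moment: datetime) -> "Task":
        self.starting = moment
        return self

    def every(self, interval: timedelta) -> "Task":
        self.interval = interval
        return self

    def when(self, condition: Callable[[], bool]) -> "Task":
        self.condition = condition
        return self

    def then(self, action: Callable[[], None]) -> "Task":
        self.action = action
        return self

    def run_once(self) -> bool:
        """Run the action if the condition holds; report whether it ran."""
        if self.condition():
            self.action()
            return True
        return False


sent = []  # stand-in for a real mail gateway (assumption)


def website_is_alive(url: str) -> bool:
    return False  # hypothetical probe; here we pretend the site is down


def notify(address: str, message: str) -> None:
    sent.append((address, message))


task = (
    Task("warn if website is down")
    .starting_from(datetime.now())
    .every(timedelta(minutes=3))
    .when(lambda: not website_is_alive("http://example.org"))
    .then(lambda: notify("admin@example.org", "server is down"))
)
ran = task.run_once()
```

Each chained call corresponds to one line of the indented scenario, which is the point of the exercise: the structure of the English text suggests the structure of the syntax. A real scheduler would also need a loop that honors the start time and interval, which this sketch leaves out.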

2. Choosing between imperative and declarative DSLs

There are two main styles for building DSLs: imperative and declarative. Each of the three DSL types can be implemented using either style, although there is a tendency to use a more imperative approach for technical DSLs and a more declarative approach for business DSLs.

An imperative DSL specifies a list of steps to execute (to output text using a templating DSL, for example). With this style, you specify what should happen.
A declarative DSL is a specification of a goal. This specification is then executed by the supporting infrastructure. With this style, you specify the intended result.

The difference is really in the intention. Imperative DSLs usually specify what to do, and declarative DSLs specify what you want done.

SQL and regular expressions are examples of declarative DSLs. They both describe what you want done, but not how to do it. Build scripts are a great example of imperative DSLs. It doesn’t matter what build engine you use (NAnt, Rake, Make); the build script lists actions that need to be executed in a specified order. There are also hybrid DSLs, which are a mix of the two. They are DSLs that specify what you want done, but they also have some explicit actions to execute.

Usually, with declarative DSLs, there are several steps along the way to the final execution. For example, SQL is a DSL that uses the declarative style. With SQL you can specify what properties you want to select and according to what criteria. You then let the database engine handle the loading of the data.

When you use an imperative DSL, the DSL directly dictates what will happen, as illustrated in figure 1.

Figure 1. Standard operating procedure for imperative DSLs

When you use a declarative DSL, the DSL specifies the desired output, and there is an engine that takes any actions required to make it so. There isn’t necessarily a one-to-one mapping between the output that the DSL requests and the actions that the engine takes, as illustrated in figure 2.

Figure 2. Standard operating procedure for declarative DSLs
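
The contrast between the two figures can be sketched in a few lines of Python. This is an illustration under my own assumptions, not code from any particular engine: run_imperative and plan_declarative are hypothetical names, and the "state" is just a dictionary.

```python
def run_imperative(steps, state):
    """Imperative style: the script is the list of actions, run in order."""
    for step in steps:
        step(state)
    return state


def plan_declarative(desired, current):
    """Declarative style: the script only states the goal. This hypothetical
    engine works out which actions are needed to get from current to desired."""
    actions = []
    for key in sorted(desired.keys() - current.keys()):
        actions.append(("create", key, desired[key]))
    for key in sorted(desired.keys() & current.keys()):
        if desired[key] != current[key]:
            actions.append(("update", key, desired[key]))
    for key in sorted(current.keys() - desired.keys()):
        actions.append(("delete", key, None))
    return actions


# Imperative: two explicit steps produce exactly two changes.
state = run_imperative(
    [lambda s: s.update(a=1), lambda s: s.pop("c", None)],
    {"b": 2, "c": 3},
)

# Declarative: one specification; the engine decides on three actions.
plan = plan_declarative(desired={"a": 1, "b": 2}, current={"b": 5, "c": 3})
```

In the imperative half, each step in the script is one action. In the declarative half, the same specification can yield zero actions or many, depending on the current state, which is why there is no one-to-one mapping between what the DSL requests and what the engine does.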

You have to decide which type of DSL you want to build. Imperative DSLs are good if you want a simple-to-understand but open-ended solution. Declarative DSLs work well when the problem itself is complex, but you can express the specification for the solution in a clear manner.

Regardless of which type of DSL you decide to build, you need to be careful not to leak implementation details into the DSL syntax. Doing so will generally make it harder to modify the DSL in the long run, and likely will confuse the users. DSLs should deal with the abstract concepts, such as applying free shipping, or suggesting registration as a preferred customer, and leave the implementation of those concepts to the application itself.

Sometimes I build declarative DSLs, and more often hybrid DSLs (more on them in a minute). Usually the result of my DSLs is an object graph describing the intent of the user that I can feed into an engine that knows how to deal with it. The DSL portion is responsible for setting this up, and not much more.

I rarely find a use for imperative DSLs. When I use them, it’s usually in some sort of helper functionality: text generation, file processing, and the like. A declarative DSL is more interesting, because it’s usually used to express complex scenarios.

I don’t write a lot of purely declarative DSLs. While those are quite interesting in the abstract, getting them to work in real-world scenarios can be hard. But mixing the styles, creating a hybrid DSL, is a powerful combination.

A hybrid DSL is a declarative DSL that uses imperative programming approaches to reach the final state that’s passed to the backend engine for processing. For example, consider this rule: “All preferred customers get an additional 2 percent discount on large orders on Sunday.” That rule is expressed in listing 1 using a hybrid of declarative and imperative styles (look at the third line):

Listing 1. A hybrid DSL, using both imperative and declarative concepts

when User.IsPreferred and Order.TotalCost > 1000:
    AddDiscountPercentage 5
    AddDiscountPercentage 2 if today is sunday
    ApplyFreeShipping

Note that this example uses the same syntax as before, but we’re adding additional conditionals to the mix: we’re mixing both styles. This is a silly example of the power of hybrid DSLs, but the ability to express control flow (loops and if constructs) and to have access to declarative concepts makes a hybrid DSL a natural fit for specifying behavior in more complex scenarios, and it can do so coherently.
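
For readers who want to see the two halves side by side in a general-purpose language, here is a rough Python analogue of listing 1. It is only a sketch: Order, RULES, when, and apply_rules are names I made up for the illustration, and the engine is deliberately trivial.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Order:
    total_cost: float
    user_is_preferred: bool
    discounts: list = field(default_factory=list)
    free_shipping: bool = False


RULES = []  # the registered rules a backend engine would consume


def when(condition):
    """Declarative part: register a rule body together with its trigger."""
    def register(body):
        RULES.append((condition, body))
        return body
    return register


@when(lambda order, today: order.user_is_preferred and order.total_cost > 1000)
def preferred_large_order(order, today):
    order.discounts.append(5)
    # Imperative part: ordinary control flow inside the rule body.
    if today.weekday() == 6:  # date.weekday() returns 6 for Sunday
        order.discounts.append(2)
    order.free_shipping = True


def apply_rules(order, today):
    """Hypothetical engine: run every rule whose condition holds."""
    for condition, body in RULES:
        if condition(order, today):
            body(order, today)
    return order


sunday_order = apply_rules(Order(1500, True), date(2012, 6, 10))  # a Sunday
monday_order = apply_rules(Order(1500, True), date(2012, 6, 11))  # a Monday
```

The decorator line plays the role of the when clause, and the if statement inside the body plays the role of the trailing "if today is sunday". The engine never needs to know that the body contains control flow; it only sees a condition and something to run.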

Before we move on, listing 2 shows another approach, arguably a more declarative one, for the same problem.

Listing 2. A more declarative approach to specifying rules

applyDiscount 5.percent:
    when User.IsPreferred and Order.TotalCost > 1000
suggestPreferred:
    when not User.IsPreferred and Order.TotalCost > 1000
freeShipping:
    when Order.TotalCost > 500 and User.IsNotPreferred
    when Order.TotalCost > 1000 and User.IsPreferred

I find the example in listing 2 to be more expressive, because it explicitly breaks away from the developer mentality of ifs and branches and forces you to think about actions and triggers, which is probably a better model for this particular problem.

Nevertheless, both styles can express the same rules, and they are equivalent in terms of complexity and usage. In fact, a rule written in one style maps one-to-one onto the other.
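
The actions-and-triggers model of listing 2 can be sketched as a plain lookup table. The Python below is an illustration only: Context, RULE_TABLE, and actions_for are hypothetical names, and I am assuming that several when lines under one action mean "any of these triggers is enough".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    total_cost: float
    is_preferred: bool


# Each action maps to the triggers that fire it.
# Assumption: multiple triggers under one action are combined with OR.
RULE_TABLE = {
    "applyDiscount 5.percent": [
        lambda c: c.is_preferred and c.total_cost > 1000,
    ],
    "suggestPreferred": [
        lambda c: not c.is_preferred and c.total_cost > 1000,
    ],
    "freeShipping": [
        lambda c: c.total_cost > 500 and not c.is_preferred,
        lambda c: c.total_cost > 1000 and c.is_preferred,
    ],
}


def actions_for(context):
    """Return the names of all actions with at least one satisfied trigger."""
    return [
        action
        for action, triggers in RULE_TABLE.items()
        if any(trigger(context) for trigger in triggers)
    ]


preferred_big = actions_for(Context(total_cost=1500, is_preferred=True))
regular_big = actions_for(Context(total_cost=1500, is_preferred=False))
```

Nothing in the table says in what order to evaluate anything or how to branch. That decision belongs to the engine, which is what makes this form read as more declarative than a block of nested conditionals.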

That’s enough theory; let’s pull the concepts of a DSL apart, and see how it works.