DSL development: 7 recommendations for Domain Specific Language design based on Domain-Driven Design

From The Enterprise Architect

The term Domain-Specific Language (DSL) is heard a lot nowadays. A DSL is a language developed to address the needs of a given domain. This domain can be a problem domain (e.g. insurance, healthcare, transportation) or a system aspect (e.g. data, presentation, business logic, workflow). The idea is to have a language with a limited set of concepts, all focused on a specific domain. This leads to higher-level languages that improve developer productivity and communication with domain experts. In many cases it is even possible to let domain experts use the DSL and develop applications themselves.

The question for this article is: how to develop a Domain-Specific Language?

I'll first explain the DSL lifecycle, consisting of the phases decision, analysis, design, implementation, deployment, and maintenance. Afterwards I'll give 7 recommendations for DSL development based on my experiences with developing non-trivial DSLs.

The DSL lifecycle

According to Mernik et al. [1] the DSL life cycle consists of five development phases: decision, analysis, design, implementation and deployment. Eelco Visser [2] adds maintenance as the sixth phase in the lifecycle of DSLs. Note that in practice DSL development isn't a sequential process; the phases should be applied iteratively.

Let's look at each phase in more detail.

1. Decision

The development of a DSL starts with the decision to develop a DSL, to reuse an existing one, or to use a general-purpose language (GPL). If a domain is very fresh and little knowledge is available, it doesn't make sense to start developing a DSL. In order to determine the basic concepts of the field, the regular software engineering process should be applied first, and a code base supported with libraries should be developed [2].

In other words: if you have never developed an application for a certain domain by hand and you have no existing code base, it isn't smart to start implementing a DSL and its associated code generators or execution engine.

The situation differs of course for non-executable DSLs. However, just as you need experience with existing code for an executable DSL, you'll need a deep understanding of the domain you are modeling for a non-executable DSL.

2. Analysis

In the analysis phase the problem domain is identified and domain knowledge is gathered. The output of formal domain analysis is a domain model consisting of [1]:

  • a domain definition, defining the scope of the domain,
  • domain terminology (vocabulary, ontology),
  • descriptions of domain concepts, and
  • feature models describing the commonalities and variabilities of domain concepts and their interdependencies.

The information gathered in this phase can be used to develop the actual DSL. Variabilities indicate what elements should be specified in the DSL, while commonalities are used to define the execution engine or domain framework.

If you, for example, analyze a couple of existing code bases in a certain domain, you can split the elements of this code in two parts: the parts that differ and the parts that are the same for each code base. The static parts (the commonalities) can, depending on your implementation approach, be part of the execution engine interpreting the DSL or can be put in a domain framework which is used by the generated code. The parts that differ (the variabilities) should be specified in the DSL; these are the parts which a user of the DSL needs to 'configure'.
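
As a rough illustration of this split, the sketch below assumes a handful of hand-written reporting scripts that differed only in title, columns and sort order. The names `ReportSpec` and `render_report` are hypothetical and not taken from any existing tool; the point is only that the variabilities become a specification while the commonalities move into shared framework code.

```python
from dataclasses import dataclass

# Variabilities: everything that differed between the hand-written code bases
# becomes something the DSL user specifies. All names here are hypothetical.
@dataclass(frozen=True)
class ReportSpec:
    title: str
    columns: tuple[str, ...]
    sort_by: str

# Commonalities: logic that was identical in every code base moves into the
# domain framework (or execution engine) and is written only once.
def render_report(spec: ReportSpec, rows: list[dict]) -> str:
    ordered = sorted(rows, key=lambda row: row[spec.sort_by])
    lines = [spec.title, " | ".join(spec.columns)]
    for row in ordered:
        lines.append(" | ".join(str(row[col]) for col in spec.columns))
    return "\n".join(lines)

orders = ReportSpec(title="Orders", columns=("id", "customer"), sort_by="id")
rows = [{"id": 2, "customer": "Bob"}, {"id": 1, "customer": "Ann"}]
report = render_report(orders, rows)
```

A new report then only requires a new `ReportSpec`; the rendering logic stays untouched.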

Eelco Visser [2] recommends an inductive approach which, as opposed to designing the complete DSL before implementation, incrementally introduces abstractions that capture a set of common programming patterns in software development for a particular domain. He also states that developing the DSL in iterations can mitigate the risk of failure. Instead of a big project that produces a functional DSL only at the end, an iterative process produces useful DSLs for sub-domains early on.

In the second part of this article I will give 7 additional recommendations for the analysis and design phases of a DSL, based on my own experiences.

3. Design

Approaches to DSL design can be characterized along two orthogonal dimensions: the relationship between the DSL and existing languages, and the formal nature of the design description [1]. A DSL can be designed from scratch, but it can be easier to base it on an existing language.

Mernik et al. [1] identify three different patterns of design based on existing languages:

  • piggyback: existing language is partially used,
  • specialization: existing language is restricted, and
  • extension: existing language is extended.

Besides the relationship with existing languages, the formal nature of the design description can range between:

  • informal: a DSL specified in natural language and/or with examples, and
  • formal: a DSL specified using one of the available semantic definition methods, e.g. regular expressions, grammars, etc.

It is important to decide what approach to take. However, it is maybe even more important to keep this lesson in mind [3]:

Lesson T2: You are almost never designing a programming language.
Most DSL designers come from language design backgrounds. There the admirable principles of orthogonality and economy of form are not necessarily well-applied to DSL design. Especially in catering to the pre-existing jargon and notations of the domain, one must be careful not to embellish or over-generalize the language.

Lesson T2 Corollary: Design only what is necessary. Learn to recognize your tendency to over-design.

4. Implementation

For executable DSLs the most suitable implementation approach should be chosen. Mernik et al. [1] identify seven different implementation patterns, all with different characteristics:

  • interpreter, DSL constructs are recognized and interpreted using a standard fetch-decode-execute cycle. With this pattern no transformation takes place, the model is directly executable.
  • compiler/application generator, DSL constructs are translated to base language constructs and library calls. This is the pattern people usually mean when they talk about code generation.
  • preprocessor, DSL constructs are translated to constructs in an existing language (the base language). Static analysis is limited to that done by the base language processor.
  • embedding, DSL constructs are embedded in an existing GPL (the host language) by defining new abstract data types and operators. Application libraries are a basic example. This type of DSL is usually called an internal DSL.
  • extensible compiler/interpreter, a GPL compiler/interpreter is extended with domain-specific optimization rules and/or domain-specific code generation. While interpreters are usually relatively easy to extend, extending compilers is hard unless they were designed with extension in mind.
  • commercial off-the-shelf, existing tools and/or notations are applied to a specific domain. You don't have to define your DSL, editor and DSL implementation approach yourself; you simply use a Model Driven Software Factory. You can, for example, use the Mendix Model-Driven Enterprise Application Platform targeted at the domain of Service-Oriented Business Applications.
  • hybrid, a combination of the above approaches.
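
To make the difference between the first two patterns concrete, here is a minimal sketch that executes one and the same model in both ways. The discount-rule model and the function names `interpret` and `generate` are assumptions made up for this illustration; a real DSL would of course have a richer model and a proper generator.

```python
# A tiny hypothetical model: discount rules expressed as data.
model = [
    {"min_total": 100, "discount": 0.10},
    {"min_total": 50, "discount": 0.05},
]

def interpret(model, total):
    """Interpreter pattern: walk the model at run time, no transformation."""
    for rule in model:
        if total >= rule["min_total"]:
            return total * (1 - rule["discount"])
    return total

def generate(model):
    """Generator pattern: translate the model into base-language source code."""
    lines = ["def discounted(total):"]
    for rule in model:
        lines.append(f"    if total >= {rule['min_total']}:")
        lines.append(f"        return total * (1 - {rule['discount']})")
    lines.append("    return total")
    return "\n".join(lines)

source = generate(model)
namespace = {}
exec(source, namespace)  # stands in for compiling the generated code
discounted = namespace["discounted"]
```

Both routes give the same result for the same input. The interpreter picks up a changed model immediately, while the generated function has to be regenerated first.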

Because the different approaches can make a big difference in the total effort to be invested in DSL development, the choice of a particular approach is very important.

5. Deployment

In the deployment phase the DSLs and the applications constructed with them are used. Developers and/or domain experts use the DSLs to specify models. These models are implemented with one of the implementation patterns presented in the previous section (e.g. the models are interpreted by an engine). Such an implementation results in working software which is used by end-users.

6. Maintenance

Because domain experts themselves can understand, validate, and modify the software by adapting the models expressed in DSLs, modifications are easier to make and their impact is easier to understand. However, more substantial changes in the software may involve altering the DSL implementation. So, like any other piece of software, a DSL will evolve over time. Having a DSL migration strategy is therefore very important.

Besides migration strategies, I have two recommendations which alleviate the maintenance risks of DSLs:

  • use a good DSL tool, which at least generates editors and generators / interpreters from your language definition, and
  • implement models specified in different DSLs with different loosely coupled engines. In this way maintenance of a specific DSL or its compiler does not affect the whole system and the engines can make use of different technologies. In case of business applications, this calls for an architecture encouraging the development of Service-Oriented Business Applications, i.e. an application composed of multiple loosely-coupled services.

Seven Domain-Driven Design based recommendations for DSL Development

Now that the lifecycle of a DSL is clear, I want to share some of my experiences with the analysis and design phases of the DSL lifecycle. The other phases are left for a future article.

Before going into the details of DSL design let's try to understand the context of these experiences. First of all, they are focused on creating multiple connected DSLs, i.e. you can create models expressed with different DSLs referring to each other. For example, in a Form model you can refer to elements from your Data model. More specifically, we are talking about a set of DSLs covering all system aspects of a Service-Oriented Business Application.

Another important point about the DSLs we're talking about is that they are all aimed at non-programmer domain experts. In most cases this means domain experts can create models expressed in these DSLs; in a few cases it means they can at least read them. This of course always requires finding a balance between flexibility and complexity.

Based on my experiences, influenced by the concepts of Domain-Driven Design [4], I have the following 7 recommendations for DSL development:

1. Capture domain knowledge in a metamodel

If you talk about models for DSLs you will stumble upon the term metamodel. For a lot of people this sounds scary enough to stop reading. However, it's just a model of the abstract structure of the language. In other words: a metamodel models the concepts of a language and their relationships, just as you model the concepts 'Order', 'Product', and 'Customer' if you are building software like an order entry portal.
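
A small sketch may take away some of the scariness. It assumes a hypothetical data-modelling DSL whose only concepts are `Entity` and `Attribute`; these names are invented for this example. The classes form the metamodel, and the objects created from them form one model expressed in that language.

```python
from dataclasses import dataclass, field

# Metamodel: the concepts of a small, hypothetical data-modelling DSL
# and the relationships between them.
@dataclass
class Attribute:
    name: str
    type: str  # e.g. "string" or "integer"

@dataclass
class Entity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)
    references: list["Entity"] = field(default_factory=list)

# Model: one instance of the metamodel, describing an order entry domain.
customer = Entity("Customer", [Attribute("name", "string")])
product = Entity("Product", [Attribute("price", "integer")])
order = Entity("Order", [Attribute("number", "integer")], [customer, product])
```

Note the two levels: 'Order' is a concept of the application domain, whereas 'Entity' is a concept of the language used to describe that domain.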

A metamodel is essential for constructing a DSL. It captures the knowledge of the domain the DSL is aimed at. The model reflects how the team developing the DSL structures the domain knowledge and what they see as the most important elements. The binding of model and implementation ensures that the experiences with earlier versions of the DSL can be used as feedback in the modeling process.

2. Communicate using a ubiquitous language

The metamodel is also important for communication purposes. When designing a DSL a lot of communication is needed between the users of the language (domain experts) and the developers. The metamodel is the backbone of a language used by all team members. Because the model is bound to the implementation, developers can talk about the DSL in this language. They can communicate with domain experts without translation.

You should play with the model when talking about the DSL. If you can't talk in terms of the model about a scenario, the model should be adapted until you can. If the domain experts don't understand the model, there is something wrong with the model. Domain experts should object to terms or structures that are awkward or inadequate to convey domain understanding. Developers should watch for ambiguity or inconsistency that will trip up design.

3. Let the metamodel drive the implementation

Don't forget that a language definition is more than just a metamodel (abstract syntax). A language definition also contains a concrete syntax and semantics. When designing and implementing a DSL the concrete syntax is captured in the solution workbench, i.e. an environment in which you can specify models using the DSL with either a textual or a graphical concrete syntax. The semantics of the language are captured in the transformation rules or model interpreter (depending on the implementation pattern used, see above).

It is important that the metamodel drives the implementation of the solution workbench and interpreter, i.e. the metamodel should drive the implementation of the DSL. If the implementation doesn't map to the metamodel, the metamodel is of little value. At the same time, complex mappings between metamodel and implementation are difficult to understand and in practice difficult to maintain as the design changes. A deadly gap between metamodel and implementation opens, so that insight gained in each of those activities does not feed into the other.

Therefore, design the metamodel in such a way that it reflects the implementation in a very literal way. However, demand at the same time that a single metamodel also serves the purpose of supporting the ubiquitous language. The implementation must become an expression of the metamodel, so a change to the code may be a change to the metamodel and the other way around. Tying the DSL implementation and metamodel together in such a way usually requires DSL tools that let you generate big parts of the DSL implementation from the metamodel. Figure 1 shows such a scenario in which part of the solution workbench and interpreter are generated from the metamodel.

Figure 1 - Metamodel-driven DSL implementation
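
The following sketch hints at what deriving implementation parts from the metamodel can look like on a very small scale. It assumes the metamodel is available as plain data; the `METAMODEL` dictionary and the `make_validator` helper are hypothetical and do not correspond to the API of any particular DSL tool. Instead of hand-coding the checks a workbench or interpreter would run on a model element, the checks are derived from the metamodel.

```python
# A hypothetical metamodel, described as plain data: concept -> property types.
METAMODEL = {
    "Entity": {"name": str, "attributes": list},
    "Attribute": {"name": str, "type": str},
}

def make_validator(metamodel):
    """Derive a model validator from the metamodel instead of hand-coding it."""
    def validate(concept, element):
        expected = metamodel[concept]
        errors = [f"unknown property: {key}" for key in element if key not in expected]
        for key, kind in expected.items():
            if key not in element:
                errors.append(f"missing property: {key}")
            elif not isinstance(element[key], kind):
                errors.append(f"{key} must be of type {kind.__name__}")
        return errors
    return validate

validate = make_validator(METAMODEL)
ok = validate("Attribute", {"name": "price", "type": "integer"})
bad = validate("Attribute", {"name": 42, "colour": "red"})
```

When a concept in `METAMODEL` changes, the validation behaviour changes with it, so metamodel and implementation cannot silently drift apart.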

4. Isolate the domain

As said before, DSLs will evolve over time. In the previous recommendation we've seen that it is important to tie model and implementation together: the model should drive the implementation. However, to do so you need to isolate the domain. If the domain code, representing the metamodel, is diffused through the rest of the code, it is very difficult to make changes to it. Changes in the GUI of your modeling environment or the infrastructure of your interpreter can then actually change your domain code.

In principle the recommendations for 'normal' software also hold for DSL implementations. Divide your code into layers and concentrate all the code related to the domain model in one layer which is isolated from GUI and infrastructure code. The domain objects should be free of the responsibility of displaying themselves, storing themselves, managing application tasks, etc. They should focus on expressing the domain model.

In the previous recommendation I stated that the model should drive the implementation, and I meant to do that as literally as possible. This is possible if you isolate the domain! Using the Generation Gap Pattern you can generate all the domain code while isolating it from your other code.
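
A minimal sketch of the Generation Gap Pattern is given below: the generator owns a base class that is overwritten on every run, and hand-written refinements live in a subclass that the generator never touches. The class names `OrderBase` and `Order` and the validation rules are assumptions made up for this example.

```python
# --- generated code: overwritten on every generator run, never edited ---
class OrderBase:
    def __init__(self, number: int, total: float) -> None:
        self.number = number
        self.total = total

    def validate(self) -> list[str]:
        errors = []
        if self.total < 0:
            errors.append("total must not be negative")
        return errors

# --- hand-written code: lives in a separate file and survives regeneration ---
class Order(OrderBase):
    def validate(self) -> list[str]:
        errors = super().validate()
        if self.number <= 0:
            errors.append("number must be positive")
        return errors

errors = Order(number=0, total=-5.0).validate()
```

The rest of the code base only refers to `Order`, so the generated base class can be regenerated from the metamodel as often as needed.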

So, isolate the domain to ensure that the model can evolve to be rich enough to express the domain and to keep track of the changes in that domain.

5. Refactor continuously

Along the same lines you should refactor all the time. You should refactor while you're knowledge crunching. You should refactor while you're communicating using the metamodel. You should refactor while you're busy implementing the DSL. You should refactor while you're generating code from your metamodel. In the words of Eric Evans [4], you should especially refactor if:

  • The design does not express the team's current understanding of the domain.
  • Important concepts are implicit in the design (and you see a way to make them explicit).
  • You see an opportunity to make some important part of the design suppler.

I think it doesn't need any explanation that such an approach needs close involvement of all team members including the domain experts.

6. Maintain metamodel integrity

To effectively abstract a complex domain with domain-specific models, you need more than one DSL. In complex projects multiple DSLs are usually necessary in order to cope with different concerns. In other words: multiple domain-specific models (DSMs), specified in different DSLs, are needed to accurately abstract complex systems.

Total unification of the metamodel (remember: the metamodel describes the concepts of the DSL we are designing) for a large domain will not be feasible or cost-effective. The most important reason for this is that attempting to satisfy everyone with a single metamodel (and thus a single language) will lead to complex options that make the language difficult to use. Avoiding that complexity is the reason we are designing a DSL at all! Different domain experts will need their own domain-specific language to define their aspect of the system.

So, we need multiple domain-specific languages, hence we also need multiple metamodels. However, the boundaries and relationships between different metamodels need to be marked consciously. Some recommendations on multi-DSL development:

  • Explicitly define the context for each metamodel, i.e. define the domain (e.g. system aspect) the DSL is designed for.
  • Continuously integrate the implementation of a metamodel and make the interfaces to other metamodels part of the automated tests.
  • Model the points of contact between the metamodels and use that model in your ubiquitous language. These points of contact define how models expressed in different DSLs can refer to each other. For example, a GUI element can refer to an element in the data model.
  • Think about your reference resolution strategy. If you use interpreters / engines to execute the models expressed in a DSL you can use late binding, i.e. use soft references and resolve them at runtime. The advantage of this strategy is flexibility and adaptability. The approach usually used with code generation is early binding: the references are explicitly reflected in the generated code. Performance can be a reason to follow this strategy.
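
The last point can be sketched as follows, assuming a form model that refers to elements of a data model by name. Both models and the functions `resolve_late` and `generate_early` are hypothetical; they only illustrate when a reference is looked up.

```python
# A hypothetical data model and a form model that refers to it by name.
data_model = {"Customer.name": "string", "Customer.age": "integer"}
form_model = {"fields": ["Customer.name", "Customer.age"]}

def resolve_late(form, data):
    """Late binding: keep soft references and look them up at run time."""
    return [(ref, data[ref]) for ref in form["fields"]]

def generate_early(form, data):
    """Early binding: resolve references once, while generating code."""
    # A dangling reference raises KeyError here, at generation time.
    resolved = [(ref, data[ref]) for ref in form["fields"]]
    return f"FIELDS = {resolved!r}"

late = resolve_late(form_model, data_model)
generated = generate_early(form_model, data_model)

# Late binding picks up a changed data model without regenerating anything.
data_model["Customer.age"] = "decimal"
late_after_change = resolve_late(form_model, data_model)
```

With early binding a broken reference is detected before deployment and no lookup is needed at runtime; with late binding the form follows changes in the data model without a new generation step.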

7. Use a people-oriented approach

Executing a DSL implementation process, especially in the way recommended in the previous points, is not easy. It requires an effective team of developers and domain experts. My last, and most important, recommendation is to use a people-first approach in DSL development. DSL development is highly creative and professional work. Developers need to make the technical decisions; they are the best people to decide how to conduct their technical work. Domain experts live the domain, hence they are best suited to decide on the applicability of the concepts of the language.

Although I strongly recommend the way of working reflected in the previous six points, the team has to decide on the process. Accepting a process requires commitment, and as such needs the active involvement of all the team.

Key takeaways for DSL development

  • Capture domain knowledge in a metamodel
  • Communicate using a ubiquitous language
  • Let the metamodel drive the implementation
  • Isolate the domain
  • Refactor continuously
  • Maintain metamodel integrity
  • Use a people-oriented approach

----------------------

[1] Marjan Mernik, Jan Heering, and Anthony M. Sloane. When and how to develop domain-specific languages. ACM Comput. Surv., 37(4):316-344, 2005.

[2] Eelco Visser. WebDSL: A case study in domain-specific language engineering. In R. Lämmel, J. Saraiva, and J. Visser, editors, Generative and Transformational Techniques in Software Engineering (GTTSE 2007), Lecture Notes in Computer Science. Springer, 2008.

[3] D. S. Wile. Lessons learned from real DSL experiments. Sci. Comput. Program., 51:265-290, 2004.

[4] Eric Evans. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2004.

Photos by Gail S and Hélio Costa