February/March 2006
Volume 46, Number 4
Subsetting and Customizing DITA

This article explores ideas related to subsetting and customizing the DITA (Darwin Information Typing Architecture) specification without the addition of new elements. Instead, we explore taking default rules and adapting them to meet the needs of specific writing and publishing environments.

About DITA

DITA is an XML-based, end-to-end architecture for authoring, producing, and delivering technical information. This architecture consists of a set of design principles for creating “information-typed” modules at a topic level and for using that content in delivery modes such as online help and product support portals on the Web.

Subsetting versus specialization

To help clarify the difference in meaning between subsetting and specialization, consider an example based on the 26 characters of the traditional English alphabet; that is, the letters A through Z. Generally, the idea of subsetting is similar to that of removing letters of the alphabet. Consider a subset containing only the five principal vowels, AEIOU, and the letters RLSTN. With these 10 letters we can create words like stone, lesson, rail, station, institutional, satin and so on. Each of these words is easily recognized as a well-formed word by users of the full set of 26 letters. It is also recognized by users of the subset. However, the words frequency, subtle and railway cannot be written with the subset, so its users would not recognize them.

Therefore, subsetting, in the context of this document, means to remove or reorganize elements, attributes and attribute values to customize the way that options are presented, while ensuring all the DITA specifications are followed. Specialization is the general process by which additional elements are added to the DITA specification to allow for custom development of new information types.
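
The alphabet analogy can be sketched in a few lines of Python. This is only an illustration of the idea; the names `SUBSET`, `FULL_ALPHABET` and `spellable` are assumptions made for this sketch and have nothing to do with DITA tooling.

```python
# Illustrative sketch of the alphabet analogy; all names here are hypothetical.
FULL_ALPHABET = set("abcdefghijklmnopqrstuvwxyz")
SUBSET = set("aeiou") | set("rlstn")  # the five vowels plus R, L, S, T, N


def spellable(word, letters=SUBSET):
    """Return True if every letter of `word` belongs to `letters`."""
    return set(word.lower()) <= letters


words = ["stone", "lesson", "rail", "station", "institutional", "satin",
         "frequency", "subtle", "railway"]

# Words that users of the subset can write.
accepted = [w for w in words if spellable(w)]
# Words that need letters outside the subset.
rejected = [w for w in words if not spellable(w)]
```

Every word in `accepted` is also valid for users of the full alphabet, which mirrors the point that content written with a DITA subset remains valid DITA. The words in `rejected` mirror standard content that the subset cannot express.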

Why subset and modify DITA

There are numerous reasons to consider customizing the DITA specification either through subsetting or through specialization. The three main reasons to subset are changing the default tags, modifying the order of elements and adjusting the frequency of element use.

Default tag set

The default tags in the DITA specification include close to 200 elements. In many cases, tags provide features that are not required in your documentation or provide undesired redundancy. Many tags can be removed from the architecture and still leave all the structure authors need. This can be done without specialization.

Default element order

The order of elements in the DITA specification is incredibly flexible. This means that elements can be inserted in a variety of ways. The result is documents that allow writers the freedom to write. However, that freedom may result in writers skipping some elements or inserting others in an order that doesn’t adhere to your style guide. Modifying the default element order lets you restrict how information is organized. As long as the result still adheres to the principles of the DITA specification, your content remains compliant and your authors have a guided workflow. For example, within the step of a given task, the DITA specification allows numerous additional elements in any order. This includes:

Default element frequency

Many of the elements in the DITA specification allow child elements to appear with no restrictions. This means that authors can insert a wide variety of elements as often as desired. This may result in undesired content, such as a step made up of a cmd followed by an unlimited number of info child elements.

Sample subsetting of a DITA element

As a practical example of subsetting within the DITA specification, consider the step element. This element has numerous default child elements with few limitations placed upon them. By defining a subset of the step element, we allow authors to create content while ensuring specific guidelines are followed. The content is clearer, and all output remains fully compliant with the DITA specification. We therefore enforce a custom style of writing while following the DITA specification.

Default rule of step

Before we modify any elements, let’s begin by reviewing the default rules that the DITA specification enforces when working with the element step. The step element specifies:

cmd then (info or substeps or tutorialinfo or stepxmp or choicetable or choices) (0 or more) then (stepresult) (optional)

Therefore, using the default, a writer can create the following type of content:
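
As a rough illustration of how permissive the default rule is, the Python sketch below encodes the content model quoted above as a regular expression and tests a few sequences of child element names. The function name `allowed_by_default` and the sample sequences are assumptions made for this sketch. A real implementation would rely on a validating XML parser and the DITA DTDs, not on a regular expression.

```python
import re

# Default content model for step, as quoted in the article:
#   cmd, (info | substeps | tutorialinfo | stepxmp | choicetable | choices)*, stepresult?
# Child element names are joined with single spaces before matching.
DEFAULT_STEP = re.compile(
    r"cmd"
    r"( (info|substeps|tutorialinfo|stepxmp|choicetable|choices))*"
    r"( stepresult)?"
)


def allowed_by_default(children):
    """Return True if the list of child element names fits the default rule."""
    return DEFAULT_STEP.fullmatch(" ".join(children)) is not None


# A hypothetical step in which info repeats and the order is arbitrary.
loose_step = ["cmd", "info", "stepxmp", "info", "choices", "info", "stepresult"]

loose_ok = allowed_by_default(loose_step)             # repetition is allowed
minimal_ok = allowed_by_default(["cmd"])              # cmd alone is enough
no_cmd_first = allowed_by_default(["info", "cmd"])    # cmd must come first
late_info = allowed_by_default(["cmd", "stepresult", "info"])  # stepresult must be last
```

Here `loose_ok` and `minimal_ok` are `True`, while `no_cmd_first` and `late_info` are `False`. The default rule fixes only the position of cmd and stepresult; everything in between may repeat in any order.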

The step contains numerous elements, some of which repeat and appear in an order that may not be repeated the next time the element step is used. By developing a custom rule, additional restrictions can be enforced for consistency within your organization.

Custom rule for step

The development of a custom rule should always be done with the confidence that output will still match the DITA specification. If the DITA specification allows an element to have numerous optional child elements, it is relatively simple to remove any of them. Since these child elements are optional, removing them has no negative impact on output. If a child element is required, it should not be removed, as the output would no longer meet the DITA specification. An example of a custom definition of step is seen below:

cmd, (info, choices?)?

Therefore, using a custom definition, a writer can create the following type of content:
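
One way to gain confidence that a custom rule stays within the DITA specification is to check that every sequence it accepts is also accepted by the default rule. The Python sketch below does this by brute force for the two rules quoted in this article. The regular expressions, the helper names `accepts` and `sequences`, and the length limit of three children are assumptions made for this sketch (three is enough because the custom rule never allows more than three children). It is not a substitute for validating against the DITA DTDs.

```python
import re
from itertools import product

NAMES = ["cmd", "info", "substeps", "tutorialinfo", "stepxmp",
         "choicetable", "choices", "stepresult"]

# Default rule: cmd, (info | substeps | ... | choices)*, stepresult?
DEFAULT_STEP = re.compile(
    r"cmd( (info|substeps|tutorialinfo|stepxmp|choicetable|choices))*( stepresult)?"
)
# Custom rule from the article: cmd, (info, choices?)?
CUSTOM_STEP = re.compile(r"cmd( info( choices)?)?")


def accepts(model, children):
    """Return True if the child element names, in order, fit the model."""
    return model.fullmatch(" ".join(children)) is not None


def sequences(max_len):
    """Yield every sequence of element names with 1 to max_len children."""
    for n in range(1, max_len + 1):
        for combo in product(NAMES, repeat=n):
            yield list(combo)


# All sequences the custom rule accepts.
custom_ok = [s for s in sequences(3) if accepts(CUSTOM_STEP, s)]
# The custom rule is a true subset if the default rule accepts all of them.
is_subset = all(accepts(DEFAULT_STEP, s) for s in custom_ok)

# Valid DITA that the custom rule turns away: a repeated info.
repeated_info = ["cmd", "info", "info"]
```

The custom rule accepts exactly three sequences: cmd alone, cmd followed by info, and cmd followed by info and choices. All three satisfy the default rule, so `is_subset` is `True`. The sequence in `repeated_info` is accepted by the default rule but rejected by the custom one, which is the restriction the custom rule is meant to impose.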

The step contains a required cmd element. After the cmd, the author may add a single info, optionally followed by a single choices element. Nothing beyond this limited subset is allowed. Authors cannot insert examples, multiple info elements, multiple choices elements and so on. The restriction helps to ensure consistency and provides more detailed guidance for each of the authors when working with a step.

Result of subsetting step

The result of the customization is a document set that is more professional, more consistent and easier to manage. Editing and translation are simplified as there are fewer decisions that need to be made based on writing style.

Subsetting tags

There are close to 200 tags in the DITA specification. One of the easiest things you can do to make a DITA implementation simpler is to reduce the number of tags.

High level DITA elements

Numerous high level elements exist in the DITA specification and several can be safely removed when subsetting. It is important to plan your document set first and only then begin to subset, as the removal of high level elements and all associated child elements is difficult to undo later. Also ensure that any element that is removed is not required elsewhere in the DITA specification. If it is, make the appropriate modifications in all locations.

Common attributes

There are also several attributes that are commonly used throughout the DITA specification that may not be required. As with elements, it is important to plan your document set and then begin to subset. Remember that many of the attributes are reused throughout the DITA specification, and it may be better to remove them on an element by element basis rather than removing them from the DITA specification completely.

Subsetting occurrence indicators and order

The frequency of elements in the DITA specification can be subset. Since the majority of elements are optional, removing them has no significant impact on the compliance of your content with the DITA specification.

Subset step

As seen earlier in this article, the element step can be subset as required. Another example of a subsetting of the default definition of the element step may appear as seen below:

cmd, info?

This new rule still matches the DITA specification. However, it has been customized to specify that a cmd must be inserted. Then, if required, info may be added, but only once.

Drawbacks to subsetting

There are two key drawbacks to consider before subsetting: tag limitation and stricter rule requirements. If a DITA implementation is well planned, neither should be a major problem in managing the way DITA is used.

Tag limitation

While subsetting enforces a stricter implementation of the DITA standard, it also deviates from the full standard. By only supporting a key set of tags, you restrict the ability to import other content that complies with the DITA specification.

Stricter rule requirements

By redefining the order of elements and their frequency, you effectively rule out some combinations of elements that others may use. In doing so, you may be limiting the usefulness of content that others provide, even though it matches the DITA specification.

Conclusion

Subsetting the DITA specification and modifying the default rules can provide many benefits to an organization. A restricted set of elements reduces the need to develop formatting and transformation rules for all possible combinations of elements. It also allows organizations to further control the types of content used and the way that they are used. This results in far more consistent documentation. As long as any subsetting and modification of the rules is done in such a way that compliance with the DITA specification is assured in your output, subsetting can be beneficial. The key is to plan based on your current documentation environment and to also plan for any future implementations that are expected. Custom implementations of most XML architectures, from DocBook to S1000D to the DITA specification, happen all the time.
By restricting tags and enforcing a custom order, your DITA implementation can be completed more quickly, with more reliable results and at a lower overall cost of development, training and implementation.

Upcoming events

The author of this article is involved in several events in 2006.

Related materials

A variety of related materials can be found online, including a set of FrameMaker specific documents for developing and publishing DITA content using a custom subset:

Tools used to develop this article

As the saying goes, “we eat our own dog food”. In an effort to prove that DITA can be used to author content, and to deliver it in numerous formats, we created this entire article using readily available tools. Content was converted as required via numerous transforms provided with the DITA toolkit.

Author information

A recognized publishing technologies expert, Bernard Aschwanden presents at conferences and events across Europe and North America. Bernard is an Adobe Certified Expert, a Certified Technical Trainer and the author of numerous publications on publishing and single sourcing including Advanced FrameMaker, published by TIPS Publishing. The founder of Publishing Smarter, a senior member of the Society for Technical Communication, the Vice President of the Toronto STC and Past President of the Computer Trainers Network, Bernard has helped hundreds of companies implement successful publishing solutions. Bernard is focused on publishing better, publishing faster and publishing smarter.

Home Page: http://www.publishingsmarter.com
Email: dita@publishingsmarter.com

© Copyright 2005