When Management decides to go to an XML publishing system, it's not always clear to the authoring and editing staff why the decision was made. Ever wonder why they choose to go this route?
In this presentation, we talk about some of the reasons why you might go away from desktop publishing systems to something more dynamic, so that you not only understand it better but can get excited about the change.
TC Dojo Expert-In-Residence
Liz Fraley, Single-Sourcing Solutions, is a serial entrepreneur. She’s founded two companies, sits on the boards of three non-profits, and is constantly coming up with new ways to share knowledge in the technical communications and content industries. She has worked in high-tech and government sectors, at companies of all different sizes (from startups to huge enterprises). She advocates approaches that directly improve organizational efficiency, productivity, and interoperability. If you ask her, she’ll say she’s happiest when those around her are successful.
Recorded: July 2013
Transcript
[00:00:01.750] - Liz Fraley
All right, good morning. So we're going to talk a little bit about Why go to XML. This is a presentation I gave to the STC Silicon Valley a long time ago. And it was a session that people who missed were like, hey, hey, are you going to have her back?
[00:00:22.140] - Liz Fraley
Because this is stuff that usually managers see and that the writers who are tasked with doing XML implementation don't always get to see. And so it's not always clear why they've made those choices. And so we're going to share that stuff with you.
[00:00:36.660] - Liz Fraley
So let's real quick talk about what XML is and why it's important. This is the stuff you typically see: consistent information, because when you change the XML file, if you've got content, information that's referenced in multiple books, you change it in one place and you change it everywhere; and multiple outputs.
[00:00:58.230] - Liz Fraley
The separation of form and content is pretty much the canonical case for XML. You don't spend a lot of time formatting, you write your stories and then you let the formatting take care of itself. And that's all things that reduce cost and improve productivity.
[00:01:15.160] - Liz Fraley
The big one here is down toward the bottom of the list, actually, which is really weird that it's down there. But it's future-proofing your information. I personally still have Word for Windows documents that I can't open anymore. It doesn't help any that they're on a three-and-a-half-inch floppy disk. But even if I had a drive that could read the disk, I wouldn't be able to open the documents.
[00:01:39.100] - Liz Fraley
And this is not uncommon to find in organisations that have products that last more than a couple of years. I think some time ago, I was part of one of the military contractors and they had told me that one of the big airplane projects was entering its 50th 5-0 year in operation.
[00:02:01.030] - Liz Fraley
So those manuals, you know, have gone through multiple costly data conversion projects, because 50 years ago we didn't have what we have today. We certainly didn't have WordPerfect or FrameMaker. So just guaranteeing that you can open those documents that were created a long time ago is even more important for long-term projects. So that's the one at the bottom.
[00:02:24.220] - Liz Fraley
Being able to create and exchange information, that is becoming a bigger deal than it used to be. When you're working with multiple outsourced teams, or you've subbed out part of your work, or you're using components from other companies, or you're OEMing your documentation, it's more important that you can give your content to somebody else and they can open it and do things with it, regardless of which side you're on, the receiving or the sending end.
[00:02:55.140] - Liz Fraley
So, on a simple, really simple basis, this is what managers see when they're learning about this thing. This is an example you typically see shown to executives, because they don't really know the day-to-day work of technical publications people. And so this is one of the things that they'll see.
[00:03:14.100] - Liz Fraley
When you do things in a word processing application, your desktop publishing app, you're formatting, and that's how you are indicating something is different from the rest of the content. So your title is bold, bigger font, maybe a different font, and its justification is different, centered or justified. That's how you show it's important or a title. It's like a pidgin language way of marking out something that's important.
[00:03:42.450] - Liz Fraley
In XML, you're identifying things for what they are. A title is a title. So you can do things appropriate to it. A lot of times you'll hear this talked about as metadata, about what it is that your content is about. So this is really what you see.
[00:04:00.420] - Liz Fraley
One of the benefits of XML is that you actually can identify that a phone number is a phone number and an address is an address, or a short description or summary, exactly as what it is, so that it can be treated differently because it is identified as being something more than just formatting.
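As an illustration of this point (not part of the original talk), here is a minimal Python sketch of what "a phone number is a phone number" buys you. The element names (`contact`, `phone`, `address`, `shortdesc`) and the sample values are assumptions made up for the example, not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names are illustrative only.
SAMPLE = """
<contact>
  <title>Technical Support</title>
  <phone>555-0100</phone>
  <address>123 Example Street</address>
  <shortdesc>How to reach the support desk.</shortdesc>
</contact>
"""


def find_by_meaning(xml_text, tag):
    """Return the text of every element with the given name."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]


# Because the content says what it is, a program can ask for it by name
# instead of guessing from bold text or font size.
print(find_by_meaning(SAMPLE, "phone"))
print(find_by_meaning(SAMPLE, "title"))
```

In a desktop publishing file, the same lookup would have to rely on formatting conventions that every author followed perfectly.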
[00:04:16.490] - Liz Fraley
All right, these are the basics. In traditional desktop publishing, what you typically have is binary storage. It's proprietary. It's readable only by that vendor's application. You can only open Windows Word for workgroups in that application. Can't even open it in Word 2010.
[00:04:37.310] - Liz Fraley
WordPerfect opens in WordPerfect. FrameMaker opens in FrameMaker. If you're in DITA or DocBook or any other XML document, it's really just an ASCII file. It's text. You can open it in any XML-aware application: XMetaL, FrameMaker if you're using Structured FrameMaker, or even TCS4. You can open a DITA document straight out of any other application, Oxygen, Arbortext, any of them. And it's always there. It's always available.
[00:05:07.340] - Liz Fraley
And in XML, if you have something you need that is specific to your business requirements, you need to, say, change the DOCTYPE a little bit. You need it to represent your information as opposed to the general software world of IBM.
[00:05:24.170] - Liz Fraley
Let's say you're a medical device company and you have an FDA filing requirement. You want to identify your content as the things that they require it be marked up as: a device, a device with certain quality assurance attachments, or specific items identified as risks.
[00:05:51.680] - Liz Fraley
And if you've just got things marked as note, and it's really a risk, it's hard to know that every note in your document is a risk because it's not really. Risks are a very specific kind of thing that the FDA requires. It's part of your 10 filing. It's different.
[00:06:12.180] - Liz Fraley
So if you can identify it as a risk, then you can do things with it specific for that content. If you're in binary storage or traditional desktop publishing, and you're trying to do a specialization or customise a DOCTYPE, typically you have to do it within that tool's capabilities. For example, in Word, you have to work with the Word template. In FrameMaker, you often have to work with the EDD, which is a different programming language all of its own.
[00:06:43.780] - Liz Fraley
So in XML you can use XML itself, what you already know how to do to create the specialization. There's no secondary tool or special way of doing things. It is all part of what you see as part of what you've learned in XML.
[00:06:59.920] - Liz Fraley
So from a storage point of view, you get a bump from going to XML. From a maintenance perspective, when you're working with desktop publishing applications, you have all your content typically in one big document or in a couple of big documents that are sandwiched together in post-processing.
[00:07:20.980] - Liz Fraley
I remember having a 120-page document in Word. And at about that point, the TOC would stop working all that well if I had autogenerated it. And you couldn't have parts of that Word document pulled out and then pulled in. It's not necessarily so true anymore in 2010, but I haven't really tested the actual inclusion in that.
[00:07:46.600] - Liz Fraley
But this is not something that's unexpected. Even if you're using traditional FrameMaker, not Structured FrameMaker, you have the same problem. Each chapter is typically in its own file. And those are all bound together by a book file. And that's done. It's combined into PDF or web output at processing time.
[00:08:08.860] - Liz Fraley
You don't usually have parts of a chapter broken up into individual files. So you've got a lot of maintenance issues because you can't have the individual common information sets. You don't have the automated styling you typically would find in XML. And you don't get the metadata.
[00:08:35.350] - Liz Fraley
In traditional DTP, your metadata is the styling itself, possibly the character or paragraph markup. But that requires that every author has absolutely adhered to using it. The first time someone hits a bold key rather than using the bold character markup, you've already missed the consistent markup and you're already dealing with maintenance issues. And we all know that no Word document comes back unchanged.
[00:09:03.670] - Liz Fraley
All right, so this is the IBM goal. This is back in 2007. Lisa Fisher at IBM said, here's our goal. It's the right content, right person, right time, right format, right media. And later on, she added, right language. And if our goal is to do some or part of this multifactored way of delivering content, then desktop publishing starts to hit its boundaries.
[00:09:35.100] - Liz Fraley
So here's what we're looking at. Typically, if you're looking at right time, right person, right place, right format, da-da-da, you've got a bunch of topics. You've got a bunch of content on the left hand side that are mixed, matched, reconfigured, reorganised, shuffled, put together, reconfigured based on some configuration requirement, whether you've got internal versus external information.
[00:10:00.450] - Liz Fraley
A lot of companies will create internal documents and there's stuff that they want to drop out of those documents when it goes to the customer. But they really want it there for technical support or field service or even their internal development team.
[00:10:14.670] - Liz Fraley
So you're looking at configuration as a way of changing audience and including or excluding information, and then you're mixing and matching again to deliver to a specific output requirement. And maybe not everything goes to the CD or goes to the printed book or goes to the web or goes to the device.
[00:10:31.500] - Liz Fraley
Maybe you just have the context-sensitive stuff that goes on the machine down here, but you want everything in the book, and maybe the book ships with the machine and maybe it doesn't. Maybe you're just shipping a getting-started QuickStart guide or poster with the product and you're not doing that with the rest of the book.
[00:10:51.120] - Liz Fraley
So if this is the issue, you really need to be able to split your content into reusable parts, just like screws, nuts, and bolts. You want to have pieces you can use and configurations that let you do what you need to do.
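To make the internal-versus-external configuration concrete (this sketch is not from the talk), here is a small Python example that builds two versions of one topic from a single source. The `audience` attribute is modeled loosely on DITA's conditional-processing attribute of the same name; the sample topic, its values, and the `build_for` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source topic; one paragraph is internal-only.
SOURCE = """
<topic>
  <title>Resetting the device</title>
  <p>Hold the power button for ten seconds.</p>
  <p audience="internal">If the reset fails, escalate to field service.</p>
</topic>
"""


def build_for(xml_text, audience):
    """Parse the topic and drop elements tagged for a different audience.

    Elements without an audience attribute are kept for everyone.
    """
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            tagged = child.get("audience")
            if tagged is not None and tagged != audience:
                parent.remove(child)
    return root


external = build_for(SOURCE, "external")
internal = build_for(SOURCE, "internal")
print(len(external.findall("p")), "paragraph(s) in the customer version")
print(len(internal.findall("p")), "paragraph(s) in the internal version")
```

The point of the sketch is that the source is written once and the audience is chosen at build time, rather than by maintaining two copies of the document.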
[00:11:06.430] - Liz Fraley
All right, so I'm going to show you three slides that have stuck with me since 2002, and that's saying something, as it's 2013 now. At an Arbortext conference back in 2002, there were three slides that really demonstrated it for me. And this is the first one.
[00:11:24.400] - Liz Fraley
You see a piece, the content master, on the left-hand side. And the yellow block is a reusable piece that's used in five different books. It appears in five different places, but it's reused in all of those places. So originally, it was all in traditional desktop publishing. And if they changed that one piece, they'd have to change it in all the other documents.
[00:11:46.510] - Liz Fraley
Also, the problem gets compounded for them. They not only went to five documents, but all five documents went to three different output formats. So now you see that yellow block duplicated magnificently. But then it gets a little worse. Whether it's language, or version, or product line, or some other dimension, your content explodes the more of them you have.
[00:12:08.880] - Liz Fraley
The more configurations, the more dimensions you're going to, the harder it becomes to manage if you don't have a piece you can change just in the master and have it show up in all of the other places. So it's no wonder these three slides stuck with me, because really this is a great demonstration of how many places you have to fix something if you're working in desktop publishing.
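The arithmetic behind those slides can be sketched in a few lines of Python (an illustration, not part of the talk). The five books and three output formats come from the example above; the count of four languages and the `copies_to_fix` helper are assumptions added for the sake of the example.

```python
from math import prod


def copies_to_fix(dimensions):
    """Number of duplicated copies of one block when every dimension multiplies it."""
    return prod(dimensions.values())


# From the talk: one block used in five books, each published in three formats.
books_and_formats = copies_to_fix({"books": 5, "output_formats": 3})

# Hypothetical extra dimension: four languages.
with_languages = copies_to_fix({"books": 5, "output_formats": 3, "languages": 4})

print(books_and_formats)  # copies to correct by hand without reuse
print(with_languages)     # copies once a language dimension is added
```

With a single reusable master, the number of places to make the fix stays at one no matter how many dimensions are added.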
[00:12:33.380] - Liz Fraley
So now I've got six other slides. These come from 2008, from Flatirons Solutions. Eric Severson, the CTO there, who I've known a long time, gave a DITA best processes presentation, and I pulled out six slides from his presentation that show numbers you don't typically see.
[00:12:55.430] - Liz Fraley
So the first one is very similar to the one we just saw from Hamilton Sundstrand. You've got a document with some specific stuff and some common stuff and a way to configure the Linux Manual on the left or the Windows Manual on the right.
[00:13:09.140] - Liz Fraley
The rest of it's the same. Just that one piece drops in or drops out. It's configured to how it works. So we're talking about the same situation and we're talking about it now eight years later.
[00:13:23.560] - Liz Fraley
So here's the one that's really good. It's easy and hard to understand how the slide works. If you look at it, there are four different documents. This company produced four different documents where basically the same stuff was in each document, but it wasn't exactly the same.
[00:13:41.080] - Liz Fraley
In this one it says, postal code is required. This one says postal code is a required field. This one says postal code is a mandatory field. This one says postal code must be present. So we've got four different statements that are all virtually equivalent.
[00:13:58.000] - Liz Fraley
But if we're taking this to translation, every last one of them will be translated. You're going to have the same essential paragraph translated four times. You're going to pay for it four times.
[00:14:09.520] - Liz Fraley
So what they did was, they said, okay, well, let's look at how these things normalise. We always get "the postal code denotes a specific mail region" at the top, and we always have "required field" at the bottom. Let's standardise and choose these two statements for every time we do it.
[00:14:27.100] - Liz Fraley
These three, you get one, either US, international, or applicability for a specific product. Because in this one, we see US in product A. This one, we see applicable country in product A. This one, we say US but no product A. This says country, but no product A. So you get one of these two and maybe this one.
[00:14:47.000] - Liz Fraley
Okay, so here's where, if you look at it, just changing that part, just those two parts, the first and the second one, we're reducing translation by 75 percent. And I say 75 percent because before we were translating the same postal code statement four times. Now we're translating that statement one time, just that one statement. And that's a reduction of 75 percent. And it's amazing how much that actually translates to.
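The 75 percent figure can be checked with a short Python sketch (an illustration, not part of the talk). The four statement variants are the ones read out above; which wording becomes the standard, and the `translation_reduction` helper, are assumptions for the example.

```python
# The four near-equivalent statements from the four documents.
VARIANTS = [
    "Postal code is required.",
    "Postal code is a required field.",
    "Postal code is a mandatory field.",
    "Postal code must be present.",
]

# Assumed standard wording, reused in every document after normalising.
STANDARD = "Postal code is a required field."


def translation_reduction(before, after):
    """Fraction of translation work saved, counting each distinct string once."""
    distinct_before = len(set(before))
    distinct_after = len(set(after))
    return (distinct_before - distinct_after) / distinct_before


normalised = [STANDARD for _ in VARIANTS]
print(translation_reduction(VARIANTS, normalised))  # 4 strings down to 1
```

Each distinct string is paid for once per target language, so the saving repeats for every language the documents are translated into.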
[00:15:21.370] - Liz Fraley
Here's the hard numbers that they shared with us. When you had content reuse, and they were looking at 75 percent. We have customers who were at 85. That's the highest I've seen, really. You're looking at total cost: word count is reduced, because you're doing reuse; your languages will stay the same; your authoring costs are lower, because you're not writing it four times, or you're not editing it four times, you're not fixing it four times.
[00:15:50.790] - Liz Fraley
Your translation costs go down 75 percent. Project management, about the same, you're doing a little bit, maybe a little bit less. Editing costs are less because you're not editing all four copies. You're editing the one copy and trusting that it's used all the places that it's supposed to be used.
[00:16:07.030] - Liz Fraley
Production costs go down to zero because now it's all automated. And you're looking at a total cost savings of $300,000, almost $400,000. And in an environment where every company is trying to save money and they're cutting staff to do it, you're looking at three or four jobs.
[00:16:27.500] - Liz Fraley
I mean, it's a significant thing for something as small as postal code, the implications and the impact it has is really, really big. And this was based on a study of their customers over the last 10 years.
[00:16:41.840] - Liz Fraley
So, real quick, another example. Let's say we're changing one term. And when you're doing translation, when you change a term, you're typically changing the entire sentence.
[00:16:55.490] - Liz Fraley
If you're using a terminology database, which XML kinds of things will let you do, or localisation memory will let you do, that also does the same thing: you don't have to change all the languages. You just change the term and then it's populated through everywhere else. So if nothing else, this thing certainly helps reduce cost.
[00:17:18.080] - Liz Fraley
So here's their numbers based on costs for correcting terms. And again, you're seeing 2,000 to a hundred. You're seeing a factor of 10 reduction, 42,000 to 2,000, that's more than a factor of 10. You're looking at five percent of what it used to be. And when you're looking at that, this is several months of staff salary. It can really be a big deal.
[00:17:44.960] - Liz Fraley
All right, so we're getting close to the half hour and I want to have at least a little time for questions. So let's talk about the few things to remember. If those numbers don't stick with you, I don't know what else will.
[00:17:57.350] - Liz Fraley
Remember, work is never done. Your content always changes. There are errors. There are changes. There are new features. There are new products. You get acquired by, or acquire, a new company. It happens all the time.
[00:18:08.180] - Liz Fraley
Even if you figure out all your content today, it will be different tomorrow. Maintenance is a big issue and has a lot of costs associated with it. So everywhere that you can adjust that, you have places for saving and even possibly increasing staff.
[00:18:26.800] - Liz Fraley
And changes to those things trigger more change. If you've got a new feature or a new product or a new company, your data model is going to change. There are things that they were doing that you didn't account for, didn't think about, and didn't want to have to do. It's less true now if you've got DITA, unless you're doing some specializations.
[00:18:46.540] - Liz Fraley
Today it makes it a little easier to change your data model over time. Metadata will change for sure, new product, new feature, or new company. These are things that didn't exist.
[00:18:56.800] - Liz Fraley
Who knew the iPad would exist 10, 15 years ago and was writing content that would allow for that? Nobody. It'll change all the time. So you want to be able to make sure that you can accommodate that.
[00:19:09.640] - Liz Fraley
New features, new products, new companies certainly, may change stylesheets. Everybody knows marketing gets it in their heads once in a while to just modernize the website, bring it up, and make it look fresh and new and comfortable, that changes your layout typically for documentation, certainly any online help, any visible help. So all that stuff will change, too.
[00:19:34.810] - Liz Fraley
Processes will change when you've got new people and new things going on, who gets approvals, and how things move through, and when it is permitted to be released. One of our topics coming up is about how to do approval systems and whether you want to separately life-cycle your output formats. And so we're going to talk a lot about how process change triggers more change.
[00:20:02.440] - Liz Fraley
And languages, nobody's doing 50 languages today in their output formats. I know some people doing 25. But most people do one and then five and then 10 and then three more little at a time. So your supported languages will change, which will change style sheets, which will change processes, which will change metadata potentially. And output formats, again, who did the iPad 10 years ago? None of us.
[00:20:30.550] - Liz Fraley
Okay, so rather than being afraid of XML, it's often good to look at it as providing opportunities to reduce work and increase efficiency. Every one of us is doing things to increase our efficiency or reduce our work.
[00:20:43.780] - Liz Fraley
Whether it's keeping a text file on your desktop of things that you commonly copy and paste, whether it's keeping a spreadsheet of your tasks and your tickets and what state each is in and who you need to go talk to for particular features, we all do it.
[00:20:59.200] - Liz Fraley
Whether or not we share those mechanisms with other people on our team or with other people in our working group, typically you don't see that. Luckily, if you've got a content management system that supports workflow and task tracking and things like that, then you might be able to do some of that.
[00:21:23.140] - Liz Fraley
There is some learning curve when you're adopting a new tool. But sometimes, maybe that can help, maybe it doesn't. So the big three are, of course, separation of form and content, reuse and repurposing of content, and supporting configuration for content.
[00:21:42.430] - Liz Fraley
You want to be able to configure for a customer, different audience, different skill level, different language, internal customers versus external customers. You want to be able to support your authoring and publication team, not just creating stuff, but being able to assemble, disassemble and reorganise books, brand new book. If you come up with a new one, how long does that take to configure and figure out what's written and what isn't rather than just cloning and modifying the entire book?
[00:22:10.600] - Liz Fraley
And then all across the enterprise, looking for opportunities to help marketing include accurate information in their white papers. They're going to copy and paste it from you anyway, so how can you preemptively give it to them in a way that they can use? There's all kinds of places for this: customer support, training, legal department, all that stuff, QA and requirements analysis, those guys.
[00:22:43.500] - Liz Fraley
So any one of these things. These are your big three things to remember. These are the things that XML typically gives you. And now you've seen the numbers and know what the proposition is and why they do it. And hopefully, that's been useful for everybody.
[00:23:01.490] - Liz Fraley
Are there questions, Janice?
[00:23:05.250] - Janice
If you have a question for the presenter, feel free to type it into the question box. We have no questions at this time. We still have a few moments, so we'll give them a couple minutes-
[00:23:19.090] - Liz Fraley
Well, we'll let the questions come in because I know sometimes it takes a while to type it in. And I'm a quick talker. I know that, too. Next session will be in August. It's also a Monday. We're going to do a little bit more detail of this presentation and talk about reuse strategies, talk about repurposing content and how you do it and what you look at within a couple of examples.
[00:23:43.060] - Liz Fraley
You've already seen how translation costs drop. But we're going to talk about it a little bit more in detail next time. That's the registration link on the bottom. Also, for anybody who's part of the Master Series, the Master Series starts meeting next Monday.
[00:23:59.380] - Liz Fraley
These are more single conversations, not presentations. So anyone who's in attendance gets to present their issues and we all work together like your board of advisors and help you figure out how to solve your problems. That's the Master Series next Monday.
[00:24:18.340] - Liz Fraley
All right, so we hope to see you next time. And don't forget, in the meantime, to vote for your choice of session topics coming up. There will be some delay between the voting and the scheduling. We need to find a presenter and we need to get their presentation together.
[00:24:32.830] - Liz Fraley
But we're certainly already getting votes for what's interesting to talk about. And please send other topic suggestions in.
[00:24:42.760] - Janice
Yeah, if you have suggestions of things that you'd like to hear discussed at the TC Dojo, please let us know. If you have a presenter that you would like to hear from during this TC Dojo, you can also let us know that as well. All right, so we'll see you all in August. Everyone, have a great day.
About the TC Dojo
At the TC Dojo, you pick the topics and we find the experts.
You can’t ask questions of a video, so be sure to join the TC Dojo and never miss attending live: http://join.tcdojo.org
Vote on future TC Dojo webinar topics here: http://survey.tcdojo.org