MS Public Sector Team Blog

Posts tagged: Document Formats

  • OpenXML and ODF - it's not a zero sum game.

    Jean Goffinet of Clever Age has a post intriguingly titled "Are we traitors or mercenaries?" on the ODF Converter team blog describing a lunch appointment with two "emissaries" from OpenOffice.org and OASIS. He explains how after the project was announced he reached out to both organisations, as he says :-

    After the project launching, I had contacted both organizations: the first one to ask for contributors (our plug-in is designed to allow its use by other applications, so we thought that OpenOffice.org could be interested in joining the team to make its product open and save docx files) and the second to have its agreement to use OASIS logo in our plug-in (agreement that we never had, so we decided to use another picture).

    Sadly he then goes on to say

    The two emissaries wanted to tell us how we were seen by OpenOffice.org and OASIS, and also to ask us some questions regarding our position. So we learnt that the OpenOffice.org team considered us as "traitors" (who are those traitors that work for the Big Satan Microsoft?) and that people from OASIS "did not like us" (even if they don't really care about our little company). In fact that was not a definitive judgement: they wanted to know exactly in which camp we were - because in this war, you cannot remain independant, you have to choose your side. So by default, as we work for Microsoft, we are seen as enemies; but if we show good willing and for instance join the OpenDocument consortium, that could be a sign that we may on the contrary be friends.

    If you are interested in document formats, in OpenXML, ODF, or interoperability you should read the rest of his post. For my part I don't think that this is a zero sum game. Happily more and more of the policy makers I am interacting with are reaching this same realisation too.

  • Gary Edwards on ODF Interop Extensions (aka ODF1.2 iX)

    Gary Edwards of the OpenDocument Foundation stopped by the other day to comment on my post "Spinning out of control", in which I said that the ISO press release about ODF made claims which I didn't believe could be truthful. Specifically, that

    Billions of existing office documents will be able to be converted to the [ODF] XML standard format with no loss of data, formatting, properties, or capabilities.

    I represent Microsoft at all kinds of meetings and my firm understanding is that one of the things that differentiates OpenXML and ODF is OpenXML's ability to faithfully represent all of the previously created Microsoft Office binary format documents. Gary was pretty forthright in his feedback and told me, amongst a few other things :-)

    "You're wrong.  The OpenDocument Foundation plug-in will deliver near perfect fidelity for ODF documents produced by MSOffice."

    Now I don't for one second mind being wrong, but I like to know where or how I'm wrong. I therefore called Brian Jones and asked him to take a look at the post and Gary's response. Brian did this and asked Gary for more information, and later posted "Spin spin sugar" over on his blog.

    Gary's now commented there too, but I am confused by what he says:-

    • He talks about an ODF "interop eXtensions" proposal that's "been quietly making the rounds of OASIS ODF TC members".  
    • He says that the "iX proposal itself comes out of our work in the Massachusetts RFi trials, where high fidelity roundtripping of ODF files is a priority".
    • He mentions "ODF 1.2 iX ready applications".

    There's a bunch of other stuff too, and you should check out Brian's post and Gary's comment to get the full picture. I'm confused because Gary's comment has left me with more questions and no answers :-

    • My reading of what he says is that the ISO Press Release is certainly wrong to claim that "Billions of existing office documents will be able to be converted to the XML standard format with no loss of data, formatting, properties, or capabilities" at least as far as ODF 1.0, to which it was referring, is concerned.
    • The only public reference I can find to Gary's "Interoperability Extensions/iX" phrase is actually an email from Gary himself that, intriguingly, refers to the OpenForum Europe Meeting I posted about here.  I say intriguingly because Gary seems to suggest that there was a "private briefing" to the IDABC Experts Group.
    • The reference to "ODF 1.2 iX" suggests to me that this is part of ODF 1.2, which I understood to be slated for late 2007. I must have got this wrong, though, because Gary says it will be ready by January 2007.

    I am left with an even stronger sense that the spinning is out of control. In any case, I hope ISO correct their press release.

  • Back to the future ... Lotus' @ formula language and ODF

    Interesting development on the ODF formula front. IBM has donated the end-user 1-2-3 @function reference help file from IBM Lotus SmartSuite 9.8 (under Section 5 of OASIS' IPR policy) to speed up the work.

    I used to work for Lotus. I remember that when IBM stopped investing in SmartSuite because "the desktop wasn't strategic", many of us were very disappointed with the Pythonesque "run away" strategy. Nevertheless, SmartSuite is still listed on IBM's web site with a buy online button next to it.

    I wonder whether there are any plans to provide an update to allow all of the users to use ODF, or OpenXML for that matter? After all, IBM claimed 28 million users in 2000, many in government. It would be cynical to think IBM was only going to implement ODF support in new products, wouldn't it? Especially given the importance IBM recognises customers place on accessing their documents in the future. [update: it seems it's not just me wondering about this]

    I smiled at the irony that, given what IBM is telling its customers, the document was converted into Word format in order to be saved as .ODT.

  • Spinning out of control.

    I’ve just been on the phone with a journalist. He disagreed when I said that a differentiator for OpenXML over ODF was the care being taken to ensure OpenXML will be able to faithfully represent the hundreds of millions, if not billions of existing documents saved in Microsoft Office’s binary formats. He pointed me to the OpenDocument press release on ISO's web site. I notice that it states (my emphasis):-

    Billions of existing office documents will be able to be converted to the XML standard format with no loss of data, formatting, properties, or capabilities. This will facilitate document contents access, search, use, integration and development in new and innovative ways.

    I wonder which “billions of existing office documents” it can be referring to. They must be documents saved in Microsoft Office's binary file formats (since those and PDF are the only formats that have been used for anything like such a number of documents). The problem is that this claim is totally untrue. ODF cannot even approximate the claimed fidelity for even a fraction of these documents.

    I've posted previously on the politics behind ODF's standardisation, and I am planning to blog more soon about how the commercial interests that are promoting ODF continue to use politics behind the scenes against their own customers' interests. It's sad to see this kind of untruthful spin and manipulation happening. It is damaging the credibility of OASIS and its Technical Committees, and ultimately ISO's too.

  • Cutting corners - the realpolitik of ODF standardisation?

    I made a comment at an event in Brussels and quite a few people have asked for more information so I thought I'd post here.

    I mentioned that in my opinion, Sun were completely aware that ODF wasn't sufficiently defined to support spreadsheet interoperability as long ago as February 2005 and that the realpolitik inside OASIS was to take advantage of the EU IDA's request to standardise by rushing to be first despite knowing the ODF specification was deficient in at least this area.

    So, here's how I reached this opinion. As I said, I'm delighted to be corrected if this proves to be factually inaccurate. This isn't intended to be FUD, I am only quoting OASIS' own mailing list.

    Back in early February 2005, James Clark made a comment to the OASIS OpenDocument Technical Committee about the lack of interoperability for spreadsheet documents (my emphasis):

    I really hope I'm missing something, because, frankly, I'm speechless.  You cannot be serious. You have virtually zero interoperability for spreadsheet documents. OpenDocument has the potential to be extraodinarily valuable and important standard. I urge you not to throw away a huge part of that potential by leaving such a gaping hole in your specification.

    Claus Agerskov further commented that this provided a means of creating lock-in (my emphasis)

    "OpenDocument doesn't specify the formulars used in spreadsheets so every spreadsheet vendor can implement formulars in their own way without being an open standard. This way a vendor can create lock-in to their spreadsheets"

    Sun's chair of the Technical Committee, Michael Brauer, responded that

    "There are from our point of view also no interoperability issues, because the namespace prefix mechanism we have specified unambiguously specifies what syntax and semantics are used for a formula".

    which David Wheeler interpreted as saying 

    "Every implementation must reverse engineer all other implementations' namespaces (they're not in the spec, so everyone's free to invent their own private incompatible namespaces). Then, every implementation must implement all the syntax and semantics of all other implementations' namespaces for formulas, if they wish to achive interoperability. And oh, by the way, your implementation might not implement the namespace for the document you're trying to load, so you may lose all the formulas."

    David made some proposals which I believe have now led to the OASIS OpenDocument Formula subcommittee, but at the time the OASIS Technical Committee answered David saying, amongst other things (my emphasis):

    "... please keep in mind that we have to fit proposed solutions into the politic of work that has already been done.  A politic that represents years of work that is just now on it's way to ratification at OASIS, and beyond to ISO.  Keep in mind also that the ISO certification comes at the request of the European Union. Time is of the essence. Ratification perhaps trumps perfection. At least for the moment. We are very much aware that whatever we leave outside the specification remains open (or not) and exposed to ambiguities and custom implementations, all of which have proved to be so problematic in the past."

    Remember this was back in February 2005. In September 2005 Newsforge carried an article euphemistically titled "OpenDocument office suites lack formula compatibility", which did a good job of demonstrating that OpenDocument represented a step backwards from the status quo and included some screenshots to illustrate the problem, concluding

    Sooner or later, a solution to the formula incompatibility problem will be found. Ideally, someone will solve it sooner, and without any intellectual property problems. The capability to exchange spreadsheets freely is essential for free desktops in the business and education markets, and it should not be limited by artificial restrictions.

    Now when Tim Bray (also with Sun) heard of this in October 2005 he wrote

    Bad Formula Trouble · I learned, to my dismay, that the ODF specification is silent on spreadsheet formulas, they’re just strings. This is obviously a problem; much discussion on what to do ensued. I lean to the idea, much bally-hooed by Novell, of simply figuring out what Excel does, writing that down, and building it into ODF v.Next. Mind you, anyone who’s really been to the mat with Excel, in terms of Math & Macros, knows that it isn’t a pretty picture, there are real coherency problems. But it’s good enough and the world has learned how to make it work.
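    To make the "they're just strings" point concrete, here is a minimal sketch of what a consuming application is left to do when the formula is an opaque string carrying only a namespace prefix. The prefix names and the sample formulas are illustrative assumptions, not taken from any specification, and the helper functions are hypothetical.

```python
# Hypothetical set of formula dialects this reader happens to understand.
# "oooc" is used purely as an illustrative prefix.
SUPPORTED_PREFIXES = {"oooc"}


def split_formula(raw):
    """Split 'prefix:=EXPR' into (prefix, expression).

    Returns (None, raw) when no namespace prefix precedes the '='.
    """
    head, sep, tail = raw.partition("=")
    if sep and head.endswith(":") and head[:-1].isidentifier():
        return head[:-1], "=" + tail
    return None, raw


def load_formula(raw):
    """Return the expression if its dialect is understood, else None.

    In this sketch an unprefixed formula is also treated as unknown.
    """
    prefix, expression = split_formula(raw)
    if prefix in SUPPORTED_PREFIXES:
        return expression
    return None  # the reader cannot interpret it, so the formula is lost
```

    Calling load_formula("oooc:=SUM([.A1:.A3])") hands back the expression, while load_formula("vendorx:=TOTAL(A1;A3)") returns None: the prefix says unambiguously which dialect was used, but that is no help to a reader that does not implement it.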

    I don't know if it's correct, but the roadmap on OASIS' web site seems to suggest that this might be fixed in ODF 1.2, which it seems is not due before October 2007 at the earliest.

    Remember all of this was known back in very early 2005, pre-ISO certification. Since ODF became ISO/IEC 26300, OASIS has been working hard to promote its adoption. A prime tactic has been to create the impression with public policy makers that the longevity of public records is in danger unless they mandate ODF.

    Somehow they neglect to mention that the spreadsheets can't interoperate though.

  • Open Secrets

    Oooh, how mysterious. I was listening to the Redmonk Podcast with Simon Phipps about yesterday's announcement of the ODF Translator project. Towards the end Simon is asked to talk about the ODF Foundation plug-in he saw demonstrated in Brussels on Tuesday.

    Simon explains that he was at a private event held under "Chatham House Rules" and that he'll "need to check with the people who demonstrated there to find out whether he can actually say what he saw". I understood The Chatham House Rule allowed you to use/discuss the information you'd gained but shouldn't identify/attribute participants. The Chatham House web site says the rule "originated with the aim of providing anonymity to speakers and to encourage openness and the sharing of information".

    Now, by coincidence, OpenForum Europe's press release cautiously welcoming Microsoft's announcement yesterday also mentions that "This week OFE, in partnership with the ODF Alliance has been running briefings in Europe to senior government officials on the impact of ODF, and were able to see one live demonstration of the plug-ins, similar to that now being proposed by Microsoft."

    Could these be one and the same? If so, why is Simon so circumspect in discussing what happened on Tuesday? There's no mention of these meetings on OpenForum Europe's web site or the ODF Alliance's.

    All very mysterious - it appeals to the Mulder in me. Why so much secrecy? What's being hidden? Who attended Tuesday's "private meeting" I wonder.

    On reflection it seems unlikely it would be the "senior government officials" that OFE/ODFA were lobbying. Whoever it was they clearly weren't football fans. The truth is out there.

  • ODF Add-In - Screenshots

    The tool adds an ODF option to the File Menu as shown ...

    I then pulled an ODT file from Massachusetts (they're not that easy to find). It downloads as .zip, not .odt, so I had to rename it before opening it. This flashes up briefly ...

     

    and then, by the magic of software ...
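
    The file arrived as .zip because an ODF document is a ZIP package, and the package declares its own type in an entry named mimetype. As a rough sketch (the function names are my own, not part of any tool mentioned here), this is one way to check whether a downloaded .zip is really an ODT before renaming it:

```python
import zipfile

# Media type that an OpenDocument text package declares in its "mimetype" entry.
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def odf_mimetype(source):
    """Return the media type declared by an ODF package, or None.

    `source` may be a file path or a seekable binary file-like object.
    """
    if not zipfile.is_zipfile(source):
        return None
    with zipfile.ZipFile(source) as package:
        if "mimetype" not in package.namelist():
            return None
        return package.read("mimetype").decode("ascii").strip()


def is_odt(source):
    """True when the package declares itself as an OpenDocument text file."""
    return odf_mimetype(source) == ODT_MIMETYPE
```

    A call such as is_odt("download.zip") (a hypothetical file name) only inspects the declared type; it does not validate the document content.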

  • A Foundation for the New World of Documents - An open letter from Chris Capossela

    Chris Capossela (or ChrisCap as he is known inside Microsoft) has written an open letter giving Microsoft’s perspective on the future of document formats.

    I've chosen to highlight this section (my links and emphasis):-

    Supporting Multiple File Formats

    Today, Microsoft Office supports multiple document formats including standards-based HTML and Rich Text Format (RTF). With the 2007 Microsoft Office release, we’ve added support for publishing in the PDF and XML Paper Specification (XPS) formats through free downloadable add-ins.

    In addition, we’ve recently announced support for an open source project to create a format translation tool between Open XML and the OpenDocument Format (ODF). This translation tool will also be available via a free download. Although file translation may not result in perfect document fidelity because of format and product differences, it is the most effective way to offer interoperability in a world where multiple file formats will need to coexist.

    Microsoft has also made extensive investments to make it easy for customers using older versions of Microsoft Office to take advantage of the new Open XML formats. Office 2000, Office XP and Office 2003 users can update their products free of charge through compatibility packs and tools, so that they will not be required to upgrade to get the benefits of using the new Open XML formats.

    Meeting the Needs of Public Sector Organizations

    [omitted para]

    Microsoft believes that public sector organizations have a lot to gain from the rapid evolution to open, XML-based documents. We encourage public sector organizations to move to XML file formats but not to mandate a particular format or implementation. There will be many different XML formats around the world, and organizations should be able to pick the right one for them based on the principles of choice, competition, interoperability and the value delivered for each project. We believe strongly that public sector organizations should keep their options open in the fast-paced area of XML innovation, with vendor-neutral purchasing policies that enable agencies to choose the most appropriate technology for their needs, while establishing guidelines for interoperability.

    I think this is good news for Public Sector customers who have come under some fairly intense lobbying from the ODF community. Doubtless Sun and IBM will find cause for complaint, but their arguments are becoming increasingly disingenuous.

  • OpenXML code snippets

    I just spotted a great post on the TechEd Bloggers feed from Erika Ehrli - "Open XML File Formats: What is it, and how can I get started?"

    Erika points out some interesting resources for those wanting to learn more about the Open XML File Formats:

    Open XML Snippets

    • Open XML: Get OfficeDocument Part: Given an Open XML file, retrieve the part with the http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument relationship type.
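
    To give a flavour of what this involves, here is a rough Python equivalent. It is not the downloadable snippet itself, only a minimal sketch that reads the package-level relationships file (_rels/.rels) from the ZIP container and looks for the officeDocument relationship type; the function name is my own.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)


def office_document_part(source):
    """Return the package path of the main document part, or None.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("_rels/.rels"))
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT:
            # Targets in the package-level .rels are relative to the root.
            return posixpath.normpath(rel.get("Target").lstrip("/"))
    return None
```

    For a typical Word document this would be expected to return word/document.xml, and for a workbook xl/workbook.xml, although the package is free to place the part elsewhere.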

    Microsoft Office Excel Snippets

    • Excel: Add Custom UI: This snippet adds a custom UI Ribbon part to a given workbook.
    • Excel: Delete Comments by a specific User: This snippet deletes all comments from a given user from a given workbook.
    • Excel: Delete Worksheet: This snippet deletes the specified worksheet from within a given workbook and resets the selected worksheet to the next one on the list. Returns true if successful, false if failure.
    • Excel: Delete Excel 4.0 Macro sheets: This snippet deletes all the Excel 4.0 Macro (XLM) sheets from a given workbook.
    • Excel: Retrieve hidden rows or columns: This snippet returns a list of hidden row numbers or column names from a given workbook and worksheet.
    • Excel: Export Chart: Given a workbook and title of a chart, this snippet exports the chart as a Chart (.crtx) file.
    • Excel: Get Cell Value: Given a workbook, worksheet and cell address, this snippet returns the value of the cell as a string.
    • Excel: Get Comments as XML: Given a workbook, this snippet returns all the comments as an XmlDocument.
    • Excel: Get Hidden Worksheets: This snippet returns a list containing the name and type of all hidden sheets in a given workbook.
    • Excel: Get Worksheet Information: This snippet returns a list containing the name and type of all sheets in a given workbook.
    • Excel: Get Cell for Reading: Given a workbook, worksheet and cell address, this snippet demonstrates how to navigate to the cell to retrieve its contents. The cell must exist for the function to find it.
    • Excel: Get Cell for Writing: Given a workbook, worksheet and cell address, this snippet demonstrates how to navigate to the cell to set its value. If the cell does not exist, the snippet creates it.
    • Excel: Insert Custom XML: Given a workbook and a custom XML value, this snippet inserts the custom XML into the workbook.
    • Excel: Insert Header or Footer: Given a workbook, worksheet and text to insert and a header or footer type, this snippet inserts the header or footer with the given text into the worksheet.
    • Excel: Insert a Numeric Value into a Cell: Given a workbook, worksheet, cell address and numeric value, this snippet inserts the value into the cell.
    • Excel: Insert a String Value into a Cell: Given a workbook, worksheet, cell address and string value, this snippet inserts the value into the cell.
    • Excel: Set Recalc Option: Given a workbook and a RecalcOption, this snippet sets the recalculation property to the new option.
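
    As an illustration of the "Get Worksheet Information" and "Get Hidden Worksheets" ideas, here is a minimal Python sketch (again, not the downloadable snippets). It assumes the workbook part lives at the conventional xl/workbook.xml location rather than resolving it through the package relationships, and it reports each sheet's visibility state rather than its type.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def sheet_info(source):
    """Return [(name, state), ...] for every sheet listed in the workbook.

    `state` is "visible", "hidden" or "veryHidden"; the attribute is
    optional in the file and defaults to "visible".
    """
    with zipfile.ZipFile(source) as package:
        # Assumption: the workbook part is at its conventional location.
        root = ET.fromstring(package.read("xl/workbook.xml"))
    sheets = root.find(f"{{{MAIN_NS}}}sheets")
    if sheets is None:
        return []
    return [
        (sheet.get("name"), sheet.get("state", "visible"))
        for sheet in sheets.findall(f"{{{MAIN_NS}}}sheet")
    ]


def hidden_sheets(source):
    """Return the names of sheets that are not visible."""
    return [name for name, state in sheet_info(source) if state != "visible"]
```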

    Microsoft Office PowerPoint Snippets

    • PowerPoint: Delete Comments by User: Given a presentation and a user name, this snippet deletes all comments by that user.
    • PowerPoint: Delete Slide by Title: Given a presentation and slide title, this snippet deletes the first instance of a slide with that title (titles are not unique).
    • PowerPoint: Get Slide Count: This snippet returns the number of slides in a given presentation.
    • PowerPoint: Get Slide Titles: Given a presentation, this snippet returns a list of the slide titles in the order presented.
    • PowerPoint: Modify Slide Title: Given a presentation, old slide title, and new slide title, this snippet changes the first instance of a slide with the given title to the new value. The snippet returns true if successful, false if not successful.
    • PowerPoint: Reorder Slides: Given a presentation, an original position, and a new position, attempt to place the slide from the original position into the new position within the deck. If the original position is outside the range of the number of slides in the deck, use the last slide. If the new position is outside the range of slides in the deck, put the selected slide at the end of the deck. The snippet returns the location where the slide was placed, or -1 on failure.
    • PowerPoint: Replace Image: Given a presentation, slide title and image file, this snippet replaces the first image on the slide with the given image.
    • PowerPoint: Retrieve Slide Location by Title: Given a presentation and a slide title, this snippet returns the 0-based location of the first slide with a matching title.
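
    "Get Slide Count" is probably the simplest of these to picture. A minimal Python sketch (not the downloadable snippet) might count the slide ID entries in the presentation part; it assumes that part sits at the conventional ppt/presentation.xml location.

```python
import zipfile
import xml.etree.ElementTree as ET

PML_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def slide_count(source):
    """Return the number of slides listed in the presentation part."""
    with zipfile.ZipFile(source) as package:
        # Assumption: the presentation part is at its conventional location.
        root = ET.fromstring(package.read("ppt/presentation.xml"))
    id_list = root.find(f"{{{PML_NS}}}sldIdLst")
    if id_list is None:
        return 0  # a deck with no slide list has no slides
    return len(id_list.findall(f"{{{PML_NS}}}sldId"))
```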

    Microsoft Office Word Snippets

    • Word: Accept Revisions: Given a document and an author name, this snippet accepts the revisions by that author.
    • Word: Add Header: Given a document and a stream containing valid header content, add the stream content as a header in the document.
    • Word: Convert DOCM to DOCX: Given a macro-enabled document (.docm), this snippet removes the VBA project and converts the file to a macro-free Word Document (.docx).
    • Word: Remove Comments: Given a Word Document, this snippet removes all the comments.
    • Word: Remove Headers and Footers: This snippet removes all headers and footers from a given Word document.
    • Word: Remove Hidden Text: This snippet removes any hidden text in a given document.
    • Word: Replace Style: Given a document and valid header content, this snippet adds the content as a header in the document.
    • Word: Retrieve Application Property: Given a document name and an app property, this snippet returns the value of the property.
    • Word: Retrieve Core Property: Given a document name and a core property, this snippet returns the value of the property.
    • Word: Retrieve Custom Property: Given a document name and a custom property, this snippet returns the value of the property.
    • Word: Retrieve Table of Contents: Given a document name, this snippet returns a table of contents as an XmlDocument.
    • Word: Set Application Property: This snippet sets a property’s value given a document name, application property and value. The snippet returns the old value if successful.
    • Word: Set Core Property: Given a document name, a core property, and property value, this snippet sets the property value.
    • Word: Set Custom Property: Given a document name, a custom property, and a value, this snippet sets the property’s value. If the property does not exist, create it. Returns true if successful, false if not.
    • Word: Set Print Orientation: Given a document name, this snippet sets the print orientation for all sections in the document.
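
    For "Retrieve Core Property", a rough Python sketch (not the downloadable snippet) could read the Dublin Core style metadata stored in the package. It assumes the core properties part is at the conventional docProps/core.xml location; the function name and the prefixed-name convention for the argument are my own.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE_NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def core_property(source, qualified_name):
    """Return the text of a core property such as "dc:title", or None.

    `qualified_name` uses the prefixes defined in CORE_NAMESPACES.
    """
    with zipfile.ZipFile(source) as package:
        # Assumption: the core properties part is at its conventional location.
        if "docProps/core.xml" not in package.namelist():
            return None
        root = ET.fromstring(package.read("docProps/core.xml"))
    element = root.find(qualified_name, CORE_NAMESPACES)
    return None if element is None else element.text
```

    The same function works for workbooks and presentations, since the core properties part is defined at the package level rather than per application.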

    Download them here!

  • Ecma Office Open XML File Formats Standard – Status Report - 21 June 2006

    Here's June's update (see also May's and April's) from the Ecma International Technical Committee (TC45) which is working to establish an international open standard for Office Open XML File Formats (as described in the TC45 program of work) :-

    This week (June 19-21), the committee held its fourth face-to-face meeting in Sapporo, Japan. The meeting was hosted by Toshiba and attended by eighteen participants. The technical agenda included new material on SpreadsheetML, PresentationML, DrawingML, compound documents, and the Open Packaging Convention. The committee reviewed issues and modifications to WordProcessingML as proposed by the committee, and finished resolving substantive issues regarding conformance and interoperability. Highlights of the meeting included presentations by Statoil, BP, and Essilor of methods for visualizing and analyzing the schema interdependencies defining the file format, and demonstrations of Java prototype tools by Toshiba to transform Office Open XML to HTML for viewing in any browser. Ecma has requested liaison with ISO/IEC JTC1 SC34 in order to help prepare a possible Open XML submission to ISO/IEC, and SC34 has appointed a liaison officer. The TC45 committee's next face-to-face meeting will be hosted in August by Microsoft in Redmond, Washington.

    The technical committee includes representatives from Apple, Barclays Capital, BP, The British Library, Essilor, Intel, Microsoft, NextPage, Novell, Statoil, and Toshiba. The United States Library of Congress has also recently joined the Ecma TC45 committee. By monitoring and contributing to the formation of this standard, the Library hopes to ensure that office productivity documents created by individuals and organizations using widely available tools can be saved in a form that will remain accessible as technology changes.

    The work is in progress, and the participants in Ecma TC45 are providing the reference schema and specification as working documents, for informational purposes only. The contents are subject to change, and may change monthly; a channel is offered to provide technical feedback on the drafts. There is also a technical forum at http://www.openxmldeveloper.org/ for developers who are interested in using the Ecma Office Open XML file formats. For general information about Ecma International, please contact Christa Rosatzin-Strobel (media@ecma-international.org).

    Tom Ngo (NextPage)
    TC45

    The following organizations have participated in the work of Ecma TC45 and their contributions are gratefully acknowledged:

    Apple, Barclays Capital, BP, The British Library, Essilor, Intel, Microsoft, NextPage, Novell, Statoil, and Toshiba

    Available Documents:

    [Update] Brian Jones has written about the Ecma meeting in Sapporo too. Brian reports that the U.S. Library of Congress has joined Ecma TC45.

© Copyright 2007 Microsoft Corporation. All rights reserved.