Bug 39795

Summary: ACCESSIBILITY: Writer XHTML export loses language information [accessibility]
Product: LibreOffice
Reporter: Christophe Strobbe <c_strobbe-fdo>
Component: Writer
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW
QA Contact:
Severity: enhancement
Priority: medium
CC: c_strobbe-fdo, sasha.libreoffice, vstuart.foote
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
Bug Depends on: 39937    
Bug Blocks: 60251    

Description Christophe Strobbe 2011-08-03 03:18:38 UTC
When an OpenDocument Text file is exported to XHTML, the exported markup contains no lang attributes: neither one that identifies the document's default language nor any that mark language changes inside the document.

Steps to reproduce the issue:
1. Create a new Writer document and insert some text in English.
2. Add a paragraph in French (e.g. copy something from fr.wikipedia.org).
3. Go to File > Export > and choose XHTML.
4. Inspect the exported XHTML file in a source code editor and search for 'lang="'.
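Step 4 can also be scripted. The following is a minimal sketch, not part of the original report: the function name find_lang_values is hypothetical, and the regular expression only mirrors the plain text search for 'lang="' while skipping xml:lang. It works on the raw source, so it would also match 'lang="' inside text content.

```python
import re

# Matches lang="..." or lang='...', but not xml:lang="..." (the lookbehind
# rejects a match when "lang" is preceded by a name character or a colon).
LANG_ATTR = re.compile(r'(?<![\w:.-])lang\s*=\s*(["\'])(.*?)\1')


def find_lang_values(xhtml_source):
    """Return the values of all plain lang attributes found in the source text."""
    return [match.group(2) for match in LANG_ATTR.finditer(xhtml_source)]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_lang.py exported.xhtml
    with open(sys.argv[1], encoding="utf-8") as handle:
        values = find_lang_values(handle.read())
    print(values if values else "no lang attributes found")
```

An empty result for the exported file corresponds to the behaviour described in this report.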

What the XHTML *should* have is:
1. lang="en" (possibly lang="en-US" or lang="en-GB", depending on the language specified for the Writer document) on the HTML element;
2. lang="..." on elements where the language changes compared to the immediate context (i.e. nearest ancestor).
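To illustrate the two expectations above, here is a sketch under stated assumptions: the SAMPLE markup is a hand-written example of what a correct export could look like (it is not actual LibreOffice output), and language_changes is a hypothetical helper that lists every element whose declared lang differs from the language inherited from its nearest ancestor.

```python
import xml.etree.ElementTree as ET

# Hand-written example of the expected markup (assumption, not real export output).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
  <body>
    <p>Some text in English.</p>
    <p lang="fr">Un paragraphe en français.</p>
  </body>
</html>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def language_changes(xhtml_source):
    """Return (element name, lang) pairs where lang differs from the inherited one."""
    root = ET.fromstring(xhtml_source)
    changes = []

    def walk(elem, inherited):
        declared = elem.get("lang")
        if declared is not None and declared != inherited:
            changes.append((local_name(elem.tag), declared))
        current = declared if declared is not None else inherited
        for child in elem:
            walk(child, current)

    walk(root, None)
    return changes


print(language_changes(SAMPLE))
```

For a correct export, the first pair should be the html element with the document's default language, followed by one pair per language change.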

Notes:
* xml:lang is also in use, but is not supported by screen readers or software for dyslexics; screen readers are used by blind users to convert content to synthetic speech and/or Braille, and correct language identification is essential for both synthetic speech and Braille.
* Using Dublin Core metadata (e.g. <meta name="DCTERMS.language" content="en-US"...) specifies the expected audience language, but not the text processing language.
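The first note can be checked mechanically as well. The sketch below assumes the exported file is well-formed XML; missing_plain_lang is a hypothetical helper that reports elements carrying xml:lang without a matching plain lang attribute.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def missing_plain_lang(xhtml_source):
    """Return (element name, xml:lang value) for elements lacking a plain lang."""
    root = ET.fromstring(xhtml_source)
    missing = []
    for elem in root.iter():
        xml_lang = elem.get(XML_LANG)
        if xml_lang is not None and elem.get("lang") is None:
            missing.append((elem.tag.rsplit("}", 1)[-1], xml_lang))
    return missing
```

Every pair returned marks an element where adding lang with the same value as xml:lang would expose the language to assistive technology.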

Background:
* <http://www.w3.org/International/tutorials/language-decl/#Slide0140>: "Declaring the text-processing language" (in W3C tutorial);
* WCAG 2.0 technique H57: Using language attributes on the html element: <http://www.w3.org/TR/2010/NOTE-WCAG20-TECHS-20101014/H57>
* WCAG 2.0 technique H58: Using language attributes to identify changes in the human language: <http://www.w3.org/TR/2010/NOTE-WCAG20-TECHS-20101014/H58.html>

Comment 1 Christophe Strobbe 2011-08-08 10:33:32 UTC
Added dependency on Bug 39937 because the XSLT for XHTML export assumes that a dc:language element exists.

Comment 2 Björn Michaelsen 2011-12-23 12:28:24 UTC
[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html

Comment 3 sasha.libreoffice 2012-01-08 21:33:47 UTC
reproduced in LibO 3.5.0 beta 1
