## August 27, 2014

### From LyX/LaTeX to Word

This is sort of a placeholder post. I'm busy meeting a deadline, but this should help future Steve and anyone else who needs to turn a LyX document into a Word document while keeping the formatting mostly sane. Broken, but sane.

1. Export as LaTeX (plain).
2. Run
• latex <name of tex file, with or without extension>

3. Run
• bibtex <filename>

4. Run
• latex <filename>

5. Run
• latex <filename>

6. Try to run
• htlatex <filename> "html,0,charset=utf-8" "" -dhtml/
• html: format to output
• 0: normally chapters go onto their own pages; putting 0 here forces everything into a single page
• charset=utf-8: let us be civilised
• -dhtml/: puts the output files in a html sub-directory. Note that you can't have a space between -d and the html/

7. If the above fails with something like 'illegal storage address', and you get a warning about tex4ht.env not being found, then you need to find where it is in your TeX installation, and then either:
• export TEX4HTENV and try again, or
• copy tex4ht.env into your working directory
• The second approach also lets you change some export parameters locally. More on this later...

8. Open the html to verify correctness. You might object to the poor graphics quality. In that case, copy tex4ht.env into the working directory if you haven't done so already, then modify it so that it uses a higher density when converting images.
• See this tex.stackexchange.com answer for more details
• In my case, since dvipng was being used, I replaced all instances of
• -D 96
• with
• -D 300

9. It also helps if you
• remove HTML comments. These look like <!-- xxx -->
• centre align image divs
• remove <hr/> instances

10. These changes will make the import into Libre/OpenOffice go more smoothly

11. Open the html file in Libre/OpenOffice

12. File > Export > ODT

13. Close html file

14. Open exported ODT

18. Verify that the ODT file is now much larger!

19. File > Save As > Word 97 (doc)
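
Steps 2 to 6 can also be scripted. The following is only a rough sketch, assuming latex, bibtex and htlatex are on your PATH. The names `build_commands` and `run_all` are made up for illustration, the `env_file` argument assumes that TEX4HTENV is meant to hold the path to tex4ht.env, and creating the output directory up front is a precaution rather than a known requirement.

```python
#!/usr/bin/env python
import os
import subprocess
import sys


def build_commands(name, outdir="html"):
    """Return the command lines for steps 2 to 6, in the order they are run."""
    return [
        ["latex", name],
        ["bibtex", name],
        ["latex", name],
        ["latex", name],
        # no space between -d and the directory, and keep the trailing slash
        ["htlatex", name, "html,0,charset=utf-8", "", "-d" + outdir + "/"],
    ]


def run_all(name, outdir="html", env_file=None):
    """Run every step, stopping at the first command that fails."""
    env = dict(os.environ)
    if env_file is not None:
        # assumption: TEX4HTENV holds the path to the tex4ht.env file
        env["TEX4HTENV"] = env_file
    # assumption: create the output directory in case htlatex does not
    os.makedirs(outdir, exist_ok=True)
    for cmd in build_commands(name, outdir):
        subprocess.run(cmd, check=True, env=env)


if __name__ == "__main__" and len(sys.argv) > 1:
    run_all(sys.argv[1])
```

Because `check=True` raises on the first failing command, a broken bibtex run will not silently carry on into htlatex.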

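The density change from step 8 can be done with a few lines of Python instead of a manual search and replace. This is a minimal sketch, assuming a copy of tex4ht.env already sits in the working directory and is plain text; `raise_density` and `patch_env_file` are hypothetical names, and the 96 and 300 defaults are simply the values mentioned above.

```python
import re


def raise_density(env_text, old=96, new=300):
    """Replace every '-D <old>' option with '-D <new>', leaving the rest alone."""
    # \b stops '-D 96' from matching inside something like '-D 960'
    pattern = r"(-D\s+)%d\b" % old
    return re.sub(pattern, r"\g<1>%d" % new, env_text)


def patch_env_file(path="tex4ht.env", old=96, new=300):
    """Rewrite the local copy of tex4ht.env in place with the new density."""
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(raise_density(text, old, new))
```

Only patch the copy in your working directory, so the file in your TeX installation stays untouched.
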
Phew! To help future visitors, a simple Python script to fix up the html as I have described is included at the end of this post. You will need lxml and cssselect installed. Cheers, Steve
#!/usr/bin/env python

from lxml.html import parse, HtmlComment
from lxml import etree

def main(*args):
    if len(args) == 0:
        return 1

    doc = parse(args[0]).getroot()

    body = doc.cssselect('body')[0]

    # replace <hr/> with <br/> to make doc conversion easier
    for hr in body.cssselect('hr'):
        p = hr.getparent()

        br = etree.Element('br')
        br.tail = hr.tail  # keep any text that followed the <hr/>
        p.replace(hr, br)  # swap in place rather than appending at the end

    # remove comments because for some reason libreoffice opens up