MS-Word | Thomas' Work Space

Unicode conversion keyboard shortcut in Microsoft Word

2017/09/26 plagwitz

A visitor asked me about the whereabouts of the uniqoder website that I recommended here. I am afraid I have lost track of it. But if you have Microsoft Word, you can always type Unicode characters easily with the ALT-X conversion keyboard shortcut: type the hexadecimal code point, then press ALT-X.
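
For the curious, here is roughly what the ALT-X conversion amounts to, as a minimal Python sketch (the function names are mine, and nothing here is part of Word): a hexadecimal code point becomes the character it stands for, and back again.

```python
def hex_to_char(code: str) -> str:
    """Turn a hexadecimal code point such as '00E9' or 'U+00E9' into its character."""
    cleaned = code.strip().upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    return chr(int(cleaned, 16))


def char_to_hex(char: str) -> str:
    """Turn a single character back into its code point, padded to at least four hex digits."""
    return format(ord(char), "04X")


if __name__ == "__main__":
    print(hex_to_char("00E9"))  # é
    print(char_to_hex("é"))     # 00E9
```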

Categories: Writing Tags: MS-Word, unicode

Inserting Word Building Blocks from Template using Ribbon or QAT

2016/06/22 plagwitz

Either insert from Ribbon / Insert / Quick Parts:

Or, for extra speed and convenience, you can have your template store access to your building blocks on the Quick Access Toolbar:

Demo video is here.

Categories: office-software, training Tags: building-blocks, MS-Word, qat, quickparts

PowerShell script to save all .pdf’s as .docx in and underneath a folder failing on Word 2016, working on Word 2010.

2016/04/02 plagwitz

Problem: Word 2016 shows erratic behavior when trying to save (admittedly complex) .PDF files as .DOCX, whether
1. using automation
  1. “The object invoked has disconnected from its clients. (Exception from HRESULT: 0x80010108 (RPC_E_DISCONNECTED))”
  2. “The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)”
2. or trying manually.
  1. “There is a problem saving the file.”
  2. “A file error has occurred.”
  3. Or Word crashes.
Workaround: My age-old Word 2010 installation on Windows Vista with PowerShell 2 (gasp!) manages this automation script (inspired by The Scripting Guy) just fine:

$Word = New-Object -ComObject Word.Application
# Acquire a list of PDF files in and underneath a folder, skipping those already processed
$Files = Get-ChildItem -Include *.pdf -Exclude *_converted.pdf -Recurse -Path 'G:\bookz\office\excel' # 'G:\bookz\lang\vba' # 'G:\bookz\office\access' #

foreach ($File in $Files) {
    $Doc1 = $null
    try {
        Write-Host "Trying  " $File.FullName
        # Open the PDF in Word, filename from the directory listing
        $Doc1 = $Word.Documents.Open($File.FullName)
        Write-Host "Opening " $File.FullName ". RESULT=" $?
        # Swap only the extension; a plain string replace would also hit "pdf" elsewhere in the path
        $Name = [System.IO.Path]::ChangeExtension($File.FullName, ".docx")
        # Save this file as a DOCX in Word 2010/2013 - hm, and 2016 fails?
        $Doc1.SaveAs([ref] $Name, [ref] 16) # see WdSaveFormat enumeration: 16 is wdFormatDocumentDefault
    }
    catch {
        $ErrorMessage = $_.Exception.Message
        Write-Host "Caught error saving " $File.FullName ". Msg: " $ErrorMessage
    }
    finally {
        # Close the document only if it was actually opened
        if ($Doc1 -ne $null) { $Doc1.Close() }
        [GC]::Collect() # watch me trying a number of things to get this to work with Word 2016... 🙂
        # Mark the PDF as processed (also after a failure), so that a rerun skips it
        Move-Item -Path $File.FullName -Destination ($File.Directory.ToString() + "\" + $File.BaseName + "_converted" + $File.Extension)
    }
}
# Shut down the Word instance that was started for the automation
$Word.Quit()
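
One detail worth some care when deriving the .docx name from the .pdf name: a plain substring replacement of "pdf" also rewrites any folder or file name that happens to contain "pdf", and it misses an upper-case ".PDF". A minimal Python sketch of the difference (the path and the function names are made up for illustration):

```python
from pathlib import PureWindowsPath


def docx_name_naive(pdf_path: str) -> str:
    """Replace every occurrence of 'pdf', wherever it appears in the path."""
    return pdf_path.replace("pdf", "docx")


def docx_name_safe(pdf_path: str) -> str:
    """Swap only the file extension, leaving folders and base name untouched."""
    return str(PureWindowsPath(pdf_path).with_suffix(".docx"))


if __name__ == "__main__":
    sample = r"G:\bookz\pdf-tools\guide.pdf"  # hypothetical path
    print(docx_name_naive(sample))
    print(docx_name_safe(sample))
```

In PowerShell, the .NET method [System.IO.Path]::ChangeExtension does the same job as the safe variant.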

Categories: service-is-programming, sourcecode Tags: converting, ms-powershell, MS-Word, pdf

How to ease editing work in MS-Word by automating search/replace operations

2015/05/08 plagwitz

If you frequently have to edit documents according to a large number of editorial rules and regulations,
and if you can partially automate these edit operations (or at least highlight suspicious passages for human review) with Word’s search/replace,
I can recommend an add-in that automates even repeated search/replace operations (like the 57 in the video below)
and helps you manage your search/replace strings and regular expressions in a spreadsheet from which it can load them:
Greg Maxey’s VBA Find & Replace Word Add-in. See it in action:
Three caveats:
1. At this point, I can get the add-in to work only in Word 2010. Even if I lower macro security and allow programmatic access to the VBA project, Word 2013 complains when I try to launch the add-in from the ribbon: “The macro cannot be found or has been disabled due to your macro security settings”.
2. The automation is only as good as your underlying search/replace operations. (Hint: “Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.”)
3. I think I will refrain from search/replace while “Track Changes” is on (as in the video) and rather use “Compare documents” after the replace operations: too many quirks otherwise…
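
The idea of keeping the search/replace rules in a spreadsheet and running them in sequence can be sketched independently of the add-in. This is not how Greg Maxey’s add-in is implemented; it is a minimal Python illustration that assumes the rules were exported as CSV with two columns I named `find` and `replace`, and the three rules are invented examples.

```python
import csv
import io
import re

# Hypothetical rule sheet: one regular expression and its replacement per row.
RULES_CSV = r'''find,replace
\be-mail\b,email
" {2,}"," "
"\s+([,.;:])",\1
'''


def load_rules(csv_text):
    """Read (compiled pattern, replacement) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(re.compile(row["find"]), row["replace"]) for row in reader]


def apply_rules(text, rules):
    """Apply every rule in order; return the new text and the hit count per rule."""
    counts = []
    for pattern, replacement in rules:
        text, hits = pattern.subn(replacement, text)
        counts.append(hits)
    return text, counts


if __name__ == "__main__":
    edited, hits = apply_rules("Send an e-mail  to us , please .", load_rules(RULES_CSV))
    print(edited)  # Send an email to us, please.
    print(hits)    # [1, 1, 2]
```

The rules run in the order listed, so a later rule sees the result of the earlier ones. The per-rule hit counts give a cheap sanity check on whether a regular expression matched far more (or less) than expected.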

Categories: service-is-documenting Tags: 2010, 2013, add-ins, automation, MS-Excel, MS-Word, regular-expressions, replacing, VBA

Fun with .docx to .html transforms by means of HtmlConverter from PowerTools for Open XML

2014/12/15 plagwitz

The transform is FOSS and platform-independent:
1. It requires neither Office nor Windows (the OpenXML SDK runs on Linux via Mono on the server).
2. However, the most recent release of PowerTools for Open XML, a high-level API to the OpenXML SDK, comes with a PowerShell interface (benefit: no Visual Studio required).
Valuable features of the transform, among many other things, are:
1. HtmlConverter is able to translate MS-Word styles into CSS (as far as needed: my code style has “No proofing” set, but this cannot be implemented on the web), so the layout is preserved as designed, but without the need for inline formatting:

        span.pt-StrongEmphasis-000052 {
            font-family: Calibri;
            font-size: 11pt;
            font-style: italic;
            font-weight: bold;
            margin: 0in;
            padding: 0in;
        }

        span.pt-lowCodeConsoleChar0 {
            color: #FFFFFF;
            background: #000000;
            font-family: Consolas;
            font-size: 10pt;
            font-weight: normal;
            margin: 0in;
            padding: 0in;
        }

          <h3 dir="ltr" class="pt-000040">
            <span class="pt-000041">2.2.1</span><span class="pt-000042"><span class="pt-000043">&nbsp;</span></span><span class="pt-Heading2Char"><b>References</b></span>
          </h3>

          <p dir="ltr" class="pt-BodyText">
            <span class="pt-DefaultParagraphFont-000003"><br />
            &lrm;</span><span class="pt-000000">&nbsp;</span>
          </p>

          <h1 dir="ltr" class="pt-000006">
            <span class="pt-000007"><b>3</b></span><span class="pt-000008"><b><span class="pt-000009">&nbsp;</span></b></span><span class="pt-Heading1Char"><b>Introduction</b></span>
          </h1>

          <h2 dir="ltr" class="pt-000018">
            <span class="pt-000019">3.1</span><span class="pt-000020"><span class="pt-000021">&nbsp;</span></span><span class="pt-Heading2Char"><b>Purpose of Document</b></span>
          </h2>

There are many more options that I have not yet tried:

            SimplifyMarkupSettings simplifyMarkupSettings = new SimplifyMarkupSettings
            {
                RemoveComments = true,
                RemoveContentControls = true,
                RemoveEndAndFootNotes = true,
                RemoveFieldCodes = false,
                RemoveLastRenderedPageBreak = true,
                RemovePermissions = true,
                RemoveProof = true,
                RemoveRsidInfo = true,
                RemoveSmartTags = true,
                RemoveSoftHyphens = true,
                RemoveGoBackBookmark = true,
                ReplaceTabsWithSpaces = false,
            };
            MarkupSimplifier.SimplifyMarkup(wordDoc, simplifyMarkupSettings);

            FormattingAssemblerSettings formattingAssemblerSettings = new FormattingAssemblerSettings
            {
                RemoveStyleNamesFromParagraphAndRunProperties = false,
                ClearStyles = false,
                RestrictToSupportedLanguages = htmlConverterSettings.RestrictToSupportedLanguages,
                RestrictToSupportedNumberingFormats = htmlConverterSettings.RestrictToSupportedNumberingFormats,
                CreateHtmlConverterAnnotationAttributes = true,
                OrderElementsPerStandard = false,
                ListItemRetrieverSettings = new ListItemRetrieverSettings()
                {
                    ListItemTextImplementations = htmlConverterSettings.ListItemImplementations,
                },
            };

One would really wish there was a way to get such HTML cleaned up automatically (ouch!):

                <span class="pt-DefaultParagraphFont-000006">M</span>
                <span class="pt-DefaultParagraphFont-000006">anaged requirements for system integration&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">of Center</span>
                <span class="pt-DefaultParagraphFont-000006">&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">software&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">with&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">iLearning</span>
                <span class="pt-DefaultParagraphFont-000006">&nbsp;and with content production and management (BPD). To mitigate lack of integration of $50k LMS software investment into departmental workflow</span>
                <span class="pt-DefaultParagraphFont-000006">,</span>
                <span class="pt-DefaultParagraphFont-000006">&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">developed&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">and documented&nbsp;</span>
                <span class="pt-DefaultParagraphFont-000006">software to automate</span>
                <span class="pt-DefaultParagraphFont-000006">&nbsp;creation of 4K+ user accounts p.a., 30K+ learning documents and 100K+ interactive content paths in LMS.</span>
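
Short of a proper fix inside the converter, such runs of spans could be merged in a post-processing step. Below is a minimal Python sketch of that idea, not a feature of PowerTools. It assumes that the spans contain plain text only (no nested tags), carry nothing but a class attribute, and that the whitespace between them is just pretty-printing. The function name is made up.

```python
import re

# Two neighbouring spans with the same class and text-only content.
_ADJACENT = re.compile(
    r'<span class="(?P<cls>[^"]+)">(?P<a>[^<]*)</span>\s*'
    r'<span class="(?P=cls)">(?P<b>[^<]*)</span>'
)


def merge_adjacent_spans(html: str) -> str:
    """Merge runs of same-class sibling spans until nothing changes any more."""
    previous = None
    while previous != html:
        previous = html
        html = _ADJACENT.sub(r'<span class="\g<cls>">\g<a>\g<b></span>', html)
    return html


if __name__ == "__main__":
    sample = (
        '<span class="pt-DefaultParagraphFont-000006">M</span>\n'
        '<span class="pt-DefaultParagraphFont-000006">anaged requirements</span>'
    )
    print(merge_adjacent_spans(sample))
```

The loop is needed because one pass only merges pairs; a run of three or more spans takes several passes to collapse into one.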

There are also much more serious conversion errors:

MS-Word displays a plain text content control and a repeating section content control within a table, containing one Combobox and one plain text content control per row, perfectly:

Convert-DocxToHtml gobbles the content completely (and so does Google Docs Preview):

The underlying HTML has just a blank table under each heading:

      <div class="pt-000001">
        <p dir="ltr" class="pt-qiCVHeading1">
          <span class="pt-DefaultParagraphFont-000002">Profile</span>
        </p>
      </div>
      <div align="left">
        <table border="1" cellspacing="0" cellpadding="0" dir="ltr" class="pt-000003" />
      </div>
      <div class="pt-000001">
        <p dir="ltr" class="pt-qiCVHeading1">
          <span class="pt-DefaultParagraphFont-000002">Technologies</span>
        </p>
      </div>
      <div align="left">
        <table border="1" cellspacing="0" cellpadding="0" dir="ltr" class="pt-000003" />
      </div>
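
A quick way to catch such gobbled tables after a batch conversion might be to scan the output for tables without any rows. Here is a minimal sketch with Python’s standard `html.parser`; the class and function names are made up, and it only flags the symptom, not its cause.

```python
from html.parser import HTMLParser


class EmptyTableFinder(HTMLParser):
    """Collect the (line, column) positions of tables that contain no <tr>."""

    def __init__(self):
        super().__init__()
        self.empty_tables = []
        self._open_tables = []  # stack of [position, has_row]

    def handle_startendtag(self, tag, attrs):
        # XHTML-style <table ... /> cannot contain any rows at all.
        if tag == "table":
            self.empty_tables.append(self.getpos())

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._open_tables.append([self.getpos(), False])
        elif tag == "tr" and self._open_tables:
            self._open_tables[-1][1] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._open_tables:
            position, has_row = self._open_tables.pop()
            if not has_row:
                self.empty_tables.append(position)


def find_empty_tables(html: str):
    """Return the positions of all row-less tables in the given markup."""
    finder = EmptyTableFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_tables
```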

I have yet to look into the underlying XML to see whether the .docx is to blame for that…
MS-Word shows:
But HtmlConverter output in IE or Firefox:
The underlying HTML reveals that the CSS does not get applied in the right place:

              <tr>
                <td class="pt-000079">
                  <p dir="ltr" class="pt-BodyTextSmall">
                    <span class="pt-BodyTextSmallChar-000081">AD</span>
                  </p>
                </td>
                <td colspan="2" class="pt-000079">
                  <p dir="ltr" class="pt-BodyTextSmall">
                    <span class="pt-BodyTextSmallChar-000081">Active Driector, Microsfot&rsquo;s directory implementation.</span>
                  </p>
                </td>
              </tr>

              <tr>
                <td class="pt-000086">
                  <p dir="ltr" class="pt-BodyTextSmall">
                    <span class="pt-000085">&nbsp;</span>
                  </p>
                </td>
                <td colspan="2" class="pt-000086">
                  <p dir="ltr" class="pt-BodyTextSmall">
                    <span class="pt-000085">&nbsp;</span>
                  </p>
                </td>
              </tr>

One could imagine MS-Word acting less strictly than OpenXML PowerTools:Convert-DocxToHtml, like a web browser’s parser tolerates and displays bad HTML. However, that would need to be justified, given that MS-Word also serves as the originating WYSIWYG editor. Moreover, OpenXML PowerTools:Get-OpenXmlValidationErrors does not seem to find any OpenXML errors in either of the above documents that could explain the bad conversion: other than dozens of Sch_UndeclaredAttribute errors (version-related? not sure how this could be), there is only a Pkg_PartIsNotAllowed relating to a glossary.

Also yet to do:
1. When (not always!) does my page title end up empty?
```
<title></title>
```
2. The output defaults to an XHTML doctype, not HTML(5).
Done:
```
openXml\OxPt\OxPtCmdlets\OxPtHelper.cs:var htmlString = html.ToString(SaveOptions.None); // trp: requesting pretty-printing, was:html.ToString(SaveOptions.DisableFormatting);
```
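
Until the two open items are sorted out in the converter itself, the output could be patched afterwards. This is a minimal Python sketch under my own assumptions: the function name and the fallback title are hypothetical, and the exact doctype string that the converter emits is not checked here, since any doctype gets replaced.

```python
import re
from html import escape

_DOCTYPE = re.compile(r"<!DOCTYPE[^>]*>", re.IGNORECASE)
_EMPTY_TITLE = re.compile(r"<title>\s*</title>", re.IGNORECASE)


def patch_html(html: str, fallback_title: str) -> str:
    """Force an HTML5 doctype and fill in an empty <title> element."""
    if _DOCTYPE.search(html):
        html = _DOCTYPE.sub("<!DOCTYPE html>", html, count=1)
    else:
        html = "<!DOCTYPE html>\n" + html
    # A function as replacement keeps backslashes in the title literal.
    filled = "<title>" + escape(fallback_title) + "</title>"
    return _EMPTY_TITLE.sub(lambda match: filled, html, count=1)
```

Note that swapping the doctype does not change the markup itself: XHTML-style self-closing tags stay as they are.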

Categories: e-infrastructure, service-is-documenting Tags: Convert-DocxToHtml, Get-OpenXmlValidationErrors, MS-Word, openxml, powertools-for-openxml, single-sourcing, XML

Fun with Zotero inserting citations and bibliographies

2014/11/17 plagwitz

If you can install Zotero’s word processor add-ins (for LibreOffice Writer or MS-Word):
If you cannot, you can still use Zotero’s “create bibliography from items” (Zotero itself can be run under portable Firefox from a USB stick, no install needed at all) and insert those into your writing. Here is a brief example:

Categories: animated-GIFs, service-is-library, training Tags: bibliographies, MS-Word, zotero
