Archive
Posts Tagged ‘XML’
PowerShell script to convert your Testing Anywhere run logs into an Excel pivot table data source
2016/01/28
- If you are confronted with a sizable Testing Anywhere test script codebase that has been only marginally enhanced or cleaned up in several years, while producing a barrage of automation errors daily,
- you may find that the run suite errors which Testing Anywhere logs automatically in its rlgx files are your best data source for monitoring and for designing a plan of attack:
- Should oft-failing scripts be put last in the daily run? How long does each script need to run?
- Could any failing script parts be modularized for the daily run?
- Are there any oft-failing scripts? E.g. here the top 8% of failing scripts account for almost 30% of the errors.
- Are there any oft-failing approaches that might benefit from refactoring? Starting with which scripts? Main actions first, then sub-actions.
- etc.
- Then this PowerShell script may help. It
- extracts the non-binary <runlog> items out of the binary rlgx files,
- merges them into a single file,
- and wraps that file with an XML declaration and a root-level node that Excel can work with.
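# '' stands in for the XML declaration plus the opening root tag, e.g. '<?xml version="1.0" encoding="UTF-8"?><root>' (root name is your choice)
Add-Content -Value '' -Path C:\td\testinganywhere\files\rlgx\all-a-rlgx.xml -Encoding UTF8
Get-ChildItem -Path C:\td\testinganywhere\files\rlgx\arnold-pc1 |
Where-Object { $_.Extension -eq ".rlgx" } |
ForEach-Object {
# ConvertTo-String is not a built-in cmdlet; it is assumed to be a helper that returns the file content as one string
$file = ConvertTo-String $_.FullName
# the pattern is meant to capture the <runlog> element embedded in the binary file
$match = [regex]::Match($file, '\s+(.*)\s+', "Singleline,IgnoreCase").Value
Add-Content -Value $match -Path C:\td\testinganywhere\files\rlgx\all-a-rlgx.xml -Encoding UTF8 }
# '' stands in for the closing root tag, e.g. '</root>'
Add-Content -Value '' -Path C:\td\testinganywhere\files\rlgx\all-a-rlgx.xml -Encoding UTF8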
- Make this PowerShell script a Scheduled Task,
- so that the XML file, which you set up as the data source for your Excel monitoring/planning workbook, updates automatically.
- The post-processing of the default error log messages that makes meaningful pivoting actually possible is left as an exercise to the reader by Testing Anywhere.
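For readers without PowerShell, the extract, merge and wrap steps can be sketched in Python. This is only a minimal sketch under assumptions: that each .rlgx file embeds its log as a plain-text `<runlog>…</runlog>` element whose bytes are valid in a UTF-8 document, and that a root element named `runlogs` is acceptable (both the function name and the root name are made up for illustration).

```python
import re
from pathlib import Path

# Assumption: the log sits in the binary file as a plain-text <runlog>...</runlog> element.
RUNLOG = re.compile(rb"<runlog\b.*?</runlog>", re.DOTALL | re.IGNORECASE)


def merge_runlogs(source_dir: Path, target: Path, root: str = "runlogs") -> int:
    """Copy every <runlog> element found in *.rlgx files into one XML file.

    Returns the number of elements written. The root element name is a
    hypothetical choice; Excel only needs a single root node.
    """
    chunks = []
    for path in sorted(source_dir.glob("*.rlgx")):
        # Read raw bytes so the binary parts around the element cannot break decoding.
        chunks.extend(RUNLOG.findall(path.read_bytes()))

    with target.open("wb") as out:
        out.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write(f"<{root}>\n".encode("utf-8"))
        for chunk in chunks:
            out.write(chunk + b"\n")
        out.write(f"</{root}>\n".encode("utf-8"))
    return len(chunks)
```

Calling `merge_runlogs(Path(r"C:\td\testinganywhere\files\rlgx\arnold-pc1"), Path(r"C:\td\testinganywhere\files\rlgx\all-a-rlgx.xml"))` would mirror the paths used in the script above.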
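The "top 8% of failing scripts account for almost 30% of the errors" kind of figure is what the pivot table delivers, but it can also be checked outside Excel. A minimal sketch, assuming the merged log has already been reduced to one script name per logged error (the function name and the input shape are assumptions, not something Testing Anywhere provides):

```python
from collections import Counter


def error_share_of_top_scripts(failed_scripts, top_fraction):
    """Share of all errors caused by the most error-prone fraction of failing scripts.

    failed_scripts: iterable with one script name per logged error (assumed input shape).
    top_fraction:   e.g. 0.08 for the top 8% of failing scripts.
    """
    counts = Counter(failed_scripts)
    if not counts:
        return 0.0
    # Always look at at least one script, even for very small fractions.
    top_n = max(1, round(len(counts) * top_fraction))
    top_errors = sum(n for _, n in counts.most_common(top_n))
    return top_errors / sum(counts.values())
```

A high share for a small fraction suggests starting the cleanup with those few scripts.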
Categories: service-is-programming, service-is-testing
MS-Excel, ms-powershell, pivot-tables, rlgx, testing-anywhere, XML
Fun with .docx to .html transforms by means of HtmlConverter from PowerTools for Open XML
2014/12/15
- The transform is FOSS and platform-independent:
- It requires neither Office nor Windows (the OpenXML SDK runs on Linux via Mono on the server).
- However, the most recent installment of PowerTools for Open XML, a high-level API to the OpenXML SDK, comes with a PowerShell interface (benefit: no Visual Studio requirement).
- Valuable features of the transform, among many other things, are:
- HtmlConverter is able to translate MS-Word styles into CSS (insofar as needed: my code style has “No proofing” set, but this cannot be implemented on the WWW), so the layout is preserved as designed, but without the need for inline formatting:
span.pt-StrongEmphasis-000052 {
font-family: Calibri;
font-size: 11pt;
font-style: italic;
font-weight: bold;
margin: 0in;
padding: 0in;
}
span.pt-lowCodeConsoleChar0 {
color: #FFFFFF;
background: #000000;
font-family: Consolas;
font-size: 10pt;
font-weight: normal;
margin: 0in;
padding: 0in;
}
<h3 dir="ltr" class="pt-000040">
<span class="pt-000041">2.2.1</span><span class="pt-000042"><span class="pt-000043">&nbsp;</span></span><span class="pt-Heading2Char"><b>References</b></span>
</h3>
<p dir="ltr" class="pt-BodyText">
<span class="pt-DefaultParagraphFont-000003"><br />
&lrm;</span><span class="pt-000000">&nbsp;</span>
</p>
<h1 dir="ltr" class="pt-000006">
<span class="pt-000007"><b>3</b></span><span class="pt-000008"><b><span class="pt-000009">&nbsp;</span></b></span><span class="pt-Heading1Char"><b>Introduction</b></span>
</h1>
<h2 dir="ltr" class="pt-000018">
<span class="pt-000019">3.1</span><span class="pt-000020"><span class="pt-000021">&nbsp;</span></span><span class="pt-Heading2Char"><b>Purpose of Document</b></span>
</h2>
- There are many more options that I have not yet tried:
SimplifyMarkupSettings simplifyMarkupSettings = new SimplifyMarkupSettings
{
RemoveComments = true,
RemoveContentControls = true,
RemoveEndAndFootNotes = true,
RemoveFieldCodes = false,
RemoveLastRenderedPageBreak = true,
RemovePermissions = true,
RemoveProof = true,
RemoveRsidInfo = true,
RemoveSmartTags = true,
RemoveSoftHyphens = true,
RemoveGoBackBookmark = true,
ReplaceTabsWithSpaces = false,
};
MarkupSimplifier.SimplifyMarkup(wordDoc, simplifyMarkupSettings);
FormattingAssemblerSettings formattingAssemblerSettings = new FormattingAssemblerSettings
{
RemoveStyleNamesFromParagraphAndRunProperties = false,
ClearStyles = false,
RestrictToSupportedLanguages = htmlConverterSettings.RestrictToSupportedLanguages,
RestrictToSupportedNumberingFormats = htmlConverterSettings.RestrictToSupportedNumberingFormats,
CreateHtmlConverterAnnotationAttributes = true,
OrderElementsPerStandard = false,
ListItemRetrieverSettings = new ListItemRetrieverSettings()
{
ListItemTextImplementations = htmlConverterSettings.ListItemImplementations,
},
};
- One would really wish there was a way to get such HTML cleaned up automatically (ouch!):
<span class="pt-DefaultParagraphFont-000006">M</span>
<span class="pt-DefaultParagraphFont-000006">anaged requirements for system integration&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">of Center</span>
<span class="pt-DefaultParagraphFont-000006">&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">software&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">with&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">iLearning</span>
<span class="pt-DefaultParagraphFont-000006">&nbsp;and with content production and management (BPD). To mitigate lack of integration of $50k LMS software investment into departmental workflow</span>
<span class="pt-DefaultParagraphFont-000006">,</span>
<span class="pt-DefaultParagraphFont-000006">&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">developed&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">and documented&nbsp;</span>
<span class="pt-DefaultParagraphFont-000006">software to automate</span>
<span class="pt-DefaultParagraphFont-000006">&nbsp;creation of 4K+ user accounts p.a., 30K+ learning documents and 100K+ interactive content paths in LMS.</span>
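One possible cleanup pass for output like the above is to merge neighbouring spans that carry the same class. The following is only a sketch, not a feature of HtmlConverter: it assumes flat spans without nested tags, and it assumes that the whitespace between two such spans is insignificant (true when it only stems from pretty-printing). The function name is made up for illustration.

```python
import re

# Two neighbouring spans with the same class and no nested markup inside them.
_ADJACENT = re.compile(
    r'<span class="([^"]+)">([^<]*)</span>\s*<span class="\1">([^<]*)</span>'
)


def merge_adjacent_spans(html: str) -> str:
    """Merge runs of same-class sibling spans into a single span."""
    previous = None
    # One pass merges pairs only, so repeat until nothing changes any more.
    while previous != html:
        previous = html
        html = _ADJACENT.sub(r'<span class="\1">\2\3</span>', html)
    return html
```

A regular expression is enough here only because the generated markup is so uniform; for anything less regular, a real HTML parser would be the safer choice.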
- There are also much more serious conversion errors:
- MS-Word displays a plain text content control and a repeating section content control within a table, containing one Combobox and one plain text content control per row, perfectly:

- Convert-DocxToHtml drops the content completely (and so does Google Docs Preview). The underlying HTML has just an empty table under each heading:
<div class="pt-000001">
<p dir="ltr" class="pt-qiCVHeading1">
<span class="pt-DefaultParagraphFont-000002">Profile</span>
</p>
</div>
<div align="left">
<table border="1" cellspacing="0" cellpadding="0" dir="ltr" class="pt-000003" />
</div>
<div class="pt-000001">
<p dir="ltr" class="pt-qiCVHeading1">
<span class="pt-DefaultParagraphFont-000002">Technologies</span>
</p>
</div>
<div align="left">
<table border="1" cellspacing="0" cellpadding="0" dir="ltr" class="pt-000003" />
</div>
I have yet to look into the underlying XML to see whether the .docx is to blame for that…
- In a second document, MS-Word shows the table as intended, but the HtmlConverter output in IE or Firefox does not. The underlying HTML reveals that the CSS does not get applied in the right place:
<tr>
<td class="pt-000079">
<p dir="ltr" class="pt-BodyTextSmall">
<span class="pt-BodyTextSmallChar-000081">AD</span>
</p>
</td>
<td colspan="2" class="pt-000079">
<p dir="ltr" class="pt-BodyTextSmall">
<span class="pt-BodyTextSmallChar-000081">Active Driector, Microsfot&rsquo;s directory implementation.</span>
</p>
</td>
</tr>
<tr>
<td class="pt-000086">
<p dir="ltr" class="pt-BodyTextSmall">
<span class="pt-000085">&nbsp;</span>
</p>
</td>
<td colspan="2" class="pt-000086">
<p dir="ltr" class="pt-BodyTextSmall">
<span class="pt-000085">&nbsp;</span>
</p>
</td>
</tr>
- One could imagine MS-Word acting less strictly than OpenXML PowerTools' Convert-DocxToHtml, much as a web browser's parser tolerates and displays bad HTML. However, not only would one then need to explain how MS-Word, which is also the originating WYSIWYG editor, came to write markup that it has to tolerate; OpenXML PowerTools' Get-OpenXmlValidationErrors also does not seem to find any OpenXML errors in either of the above documents that could explain the bad conversion. Other than dozens of Sch_UndeclaredAttribute errors (version-related? not sure how this could be), there is only a Pkg_PartIsNotAllowed relating to a glossary.
- Also yet to do:
- When (not always!) does my page title end up empty?
<title></title>
- Defaults to doctype xhtml, not html(5).
- Done:
- Pretty-printing. The HtmlConverter output defaults to putting all content (except the CSS) on one line (in the example from which the above code is taken, 90,000 characters long). For human readability, and possibly also for git tracking, pretty-printing would be better. It can be enforced like so (is there a better way? I cannot see a user-configurable option for the SaveOptions enumeration):
openXml\OxPt\OxPtCmdlets\OxPtHelper.cs:var htmlString = html.ToString(SaveOptions.None); // trp: requesting pretty-printing, was:html.ToString(SaveOptions.DisableFormatting);
Enterprise Library Logging Sample
2014/07/03
Using Enterprise Library (still on 5), you can declaratively configure the logger properties (including the desired formatting, see the TextFormatter templates below) in the app.config, next to its appSettings:
<loggingConfiguration name="Logging Application Block" tracingEnabled="true"
defaultCategory="General" logWarningsWhenNoCategoriesMatch="true"> <listeners> <add name="Event Log Listener" type="Microsoft.Practices.EnterpriseLibrary.Logging.TraceListeners.FormattedEventLogTraceListener, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
listenerDataType="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.FormattedEventLogTraceListenerData, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
source="Enterprise Library Logging" formatter="Text Formatter 2"
log="" machineName="." traceOutputOptions="None" /> <add name="Rolling Flat File Trace Listener" type="Microsoft.Practices.EnterpriseLibrary.Logging.TraceListeners.RollingFlatFileTraceListener, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
listenerDataType="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.RollingFlatFileTraceListenerData, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
fileName="%AppData%\trpsoft\langlabemailer\trace-rolling.log"
footer="" formatter="Text Formatter" header="" rollFileExistsBehavior="Increment"
rollInterval="Day" rollSizeKB="1000" maxArchivedFiles="10" traceOutputOptions="None" /> <add name="Flat File Trace Listener" type="Microsoft.Practices.EnterpriseLibrary.Logging.TraceListeners.FlatFileTraceListener, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
listenerDataType="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.FlatFileTraceListenerData, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
fileName="%AppData%\trpsoft\langlabemailer\exception.log" header=""
footer="" formatter="Text Formatter" traceOutputOptions="None" /> <add name="Rolling Flat File Trace Listener 2" type="Microsoft.Practices.EnterpriseLibrary.Logging.TraceListeners.RollingFlatFileTraceListener, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
listenerDataType="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.RollingFlatFileTraceListenerData, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
fileName="%AppData%\trpsoft\langlabemailer\exception-rolling.log"
footer="" formatter="Text Formatter" header="" rollFileExistsBehavior="Increment"
rollInterval="Hour" rollSizeKB="100" maxArchivedFiles="10" filter="All" /> </listeners> <formatters> <add type="Microsoft.Practices.EnterpriseLibrary.Logging.Formatters.TextFormatter, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
template="Timestamp {timestamp} Message {message} Category {category} Priority {priority} EventId {eventid} Severity {severity} Title {title} Machine {localMachine} App Domain {localAppDomain} ProcessId {localProcessId} Process Name {localProcessName} Thread Name {threadName} Win32 ThreadId {win32ThreadId} Extended Properties {dictionary({key} - {value})}"
name="Text Formatter" /> <add type="Microsoft.Practices.EnterpriseLibrary.Logging.Formatters.TextFormatter, Microsoft.Practices.EnterpriseLibrary.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
template="Timestamp: {timestamp}{newline}
Message: {message}{newline}
Category: {category}{newline}
Priority: {priority}{newline}
EventId: {eventid}{newline}
Severity: {severity}{newline}
Title:{title}{newline}
Machine: {localMachine}{newline}
App Domain: {localAppDomain}{newline}
ProcessId: {localProcessId}{newline}
Process Name: {localProcessName}{newline}
Thread Name: {threadName}{newline}
Win32 ThreadId:{win32ThreadId}{newline}
Extended Properties: {dictionary({key} - {value}{newline}
)}"
name="Text Formatter 2" /> </formatters>
<categorySources>
<add switchValue="All" name="General">
<listeners>
<add name="Rolling Flat File Trace Listener" />
</listeners>
</add>
<add switchValue="All" name="Exceptions">
<listeners>
<add name="Event Log Listener" />
<add name="Rolling Flat File Trace Listener 2" />
</listeners>
</add>
</categorySources>
<specialSources>
<allEvents switchValue="All" name="All Events" />
<notProcessed switchValue="All" name="Unprocessed Category" />
<errors switchValue="All" name="Logging Errors &amp; Warnings">
<listeners>
<add name="Event Log Listener" />
</listeners>
</errors>
</specialSources>
</loggingConfiguration>
<exceptionHandling>
<exceptionPolicies>
<add name="Log and Rethrow">
<exceptionTypes>
<add name="All Exceptions" type="System.Exception, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
postHandlingAction="NotifyRethrow"> <exceptionHandlers> <add name="Logging Exception Handler" type="Microsoft.Practices.EnterpriseLibrary.ExceptionHandling.Logging.LoggingExceptionHandler, Microsoft.Practices.EnterpriseLibrary.ExceptionHandling.Logging, Version=5.0.505.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
logCategory="Exceptions" eventId="100" severity="Error" title="Enterprise Library Exception Handling"
formatterType="Microsoft.Practices.EnterpriseLibrary.ExceptionHandling.TextExceptionFormatter, Microsoft.Practices.EnterpriseLibrary.ExceptionHandling"
priority="0" /> </exceptionHandlers> </add> </exceptionTypes> </add> </exceptionPolicies> </exceptionHandling> <appSettings>
Import and call the logger like so:
using Microsoft.Practices.EnterpriseLibrary.ExceptionHandling.Logging;
using Microsoft.Practices.EnterpriseLibrary.Logging;
Logger.Write("regex:RegExRecordingFileGroup - target:" + "\t" + _filenamenoext + "\t" + strGroups);
The tab-separated output of the latter (the Logger.Write call) can be easily imported and analyzed in MS-Excel.
These are obviously only the simplest examples; study the Enterprise Library documentation for more customization.
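The reason for putting tabs into the log message is that each part then lands in its own column on import. Outside Excel, the same kind of analysis can be sketched in a few lines of Python. This is a sketch under assumptions: the messages have already been isolated one per line (the formatter templates above add further fields around them), and the function name is made up for illustration.

```python
import csv
import io
from collections import Counter


def count_by_column(log_text: str, column: int) -> Counter:
    """Count tab-separated log messages by the value found in one column."""
    rows = csv.reader(
        io.StringIO(log_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters in messages as plain text
    )
    # Skip lines that are too short to have the requested column.
    return Counter(row[column] for row in rows if len(row) > column)
```

With the message layout used above, column 1 would hold the file name and column 2 the matched groups.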
My DkPro settings.xml
2012/06/04
<?xml version="1.0" encoding="utf-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
<profiles>
<profile>
<id>ukp-oss-releases</id>
<repositories>
<repository>
<id>ukp-oss-releases</id>
<url>http://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/public-releases</url>
<releases>
<enabled>true</enabled>
<updatePolicy>never</updatePolicy>
<checksumPolicy>warn</checksumPolicy>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>ukp-oss-releases</id>
<url>http://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/public-releases</url>
<releases>
<enabled>true</enabled>
<updatePolicy>never</updatePolicy>
<checksumPolicy>warn</checksumPolicy>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
</pluginRepositories>
</profile>
<profile>
<id>ukp-oss-snapshots</id>
<repositories>
<repository>
<id>ukp-oss-snapshots</id>
<url>http://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/public-snapshots</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
</profile>
</profiles>
<activeProfiles>
<activeProfile>ukp-oss-releases</activeProfile>
<!-- the preceding profile must not be commented out -->
<!-- Uncomment the following entry if you need SNAPSHOT versions. -->
<activeProfile>ukp-oss-snapshots</activeProfile>
</activeProfiles>
</settings>
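Since commenting profiles in and out by hand is error-prone, a quick consistency check may help: every activeProfile should name a profile that is actually defined. The following is a minimal sketch that only looks at the settings.xml itself (Maven can also activate profiles defined elsewhere, e.g. in a POM, which this check would report as missing); the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <settings> root element above.
NS = {"m": "http://maven.apache.org/SETTINGS/1.0.0"}


def undefined_active_profiles(settings_xml: str) -> list[str]:
    """Return active profile ids that have no matching <profile> in the same file."""
    root = ET.fromstring(settings_xml)
    defined = {
        profile.findtext("m:id", namespaces=NS)
        for profile in root.findall("m:profiles/m:profile", NS)
    }
    active = [a.text for a in root.findall("m:activeProfiles/m:activeProfile", NS)]
    return [name for name in active if name not in defined]
```

An empty result means every active profile in the file is backed by a definition in the same file.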
Categories: service-is-learning-materials-creation, service-is-programming
DkPro, eclipse, ide, m2eclipse, maven, nlp, subclipse, XML


How to get Square brackets (and hide comments) with ISO690 in Word 2013 bibliography styles
<!-- trp: -->
[
-->
ISO 690YOURNAMEHERE