Suppose we have a folder containing a manifest.xml and some other files (basictypes.xml and packages.xml) that the manifest references. Each of these files contains multiple objects of a specific type, and we want to split them into separate files, one per object.
There are some hurdles to overcome:
- Some objects are logical duplicates (same identifier) and would be written to the same URI, which results in an exception:
  SystemID: C:\pelssers\demo\manifest_transformer.xsl
  Engine name: Saxon-HE 9.3.0.5
  Severity: fatal
  Description: Cannot write more than one result document to the same URI: file:/c:/pelssers/demo/export/basictypes/PH3330L.xml
  Start location: 27:0
  URL: http://www.w3.org/TR/xslt20/#err-XTDE1490
- The objects are not identifiable with the same XPath expression (a basictype is keyed by @identifier, a package by @id), so using a single group-by declaration for this heterogeneous bunch of elements needed a bit of thinking. I had to resort to a "generic" function that delegates to matching templates for the specific type of element.
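To make the first hurdle concrete, here is a minimal sketch of a naive split that would run into XTDE1490 when applied to a file like basictypes.xml. It is not the original stylesheet that produced the error report above; the relative output path export/basictypes/ is an assumption for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Naive split: one result document per basictype, no duplicate handling. -->
  <xsl:template match="/basictypes">
    <xsl:for-each select="basictype">
      <!-- Two basictype elements sharing the same @identifier produce the same href,
           so the second xsl:result-document raises XTDE1490.
           The relative folder export/basictypes/ is a hypothetical choice. -->
      <xsl:result-document method="xml" href="export/basictypes/{@identifier}.xml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Grouping the elements by their target URI before writing, as done further down, is one way to guarantee that each URI is written at most once.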
manifest.xml
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <file href="basictypes.xml"/>
  <file href="packages.xml"/>
</manifest>
basictypes.xml
<?xml version="1.0" encoding="UTF-8"?>
<basictypes>
  <basictype identifier="PH3330L">
    <description>N-channel TrenchMOS logic level FET</description>
    <magcode>R73</magcode>
  </basictype>
  <basictype identifier="BUK3F00-50WDFE">
    <description>9675 AUTO IC (IMPULSE)</description>
    <magcode>R73</magcode>
  </basictype>
  <basictype identifier="PH3330L">
    <description>this is a duplicate of PH3330L</description>
    <magcode>R73</magcode>
  </basictype>
</basictypes>
packages.xml
<?xml version="1.0" encoding="UTF-8"?>
<packages>
  <package id="SOT669">
    <description>plastic single-ended surface-mounted package; 4 leads</description>
    <name>LFPAK; Power-SO8</name>
  </package>
  <package id="SOT600-1">
    <description>plastic thin fine-pitch ball grid array package;</description>
    <name>TFBGA208</name>
  </package>
</packages>
In the XSLT below, I first chose a grouping strategy to avoid the error caused by writing duplicate items to the same URI. Next, I wrote a generic function, pelssers:getURI, that accepts any of the element types (basictype and package) and delegates the call to the matching template in mode="getURI". Only the first element in each group is processed in mode="write"; all subsequent elements of that group are processed in mode="skip". Here the skip handler only logs a message that the element is being skipped, but I could also have implemented it differently, for example by writing the duplicates to another folder. The only thing I would have to make sure of is that the URI includes some uniquely identifying part, e.g. the result of generate-id().
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:pelssers="http://robbypelssers.blogspot.com"
                version="2.0">

  <xsl:param name="sourceFolder" select="xs:anyURI('file:///c:/pelssers/demo/')"/>
  <xsl:param name="destinationFolder" select="xs:anyURI('file:///c:/pelssers/demo/export/')"/>

  <!-- Generic entry point: delegates to the mode="getURI" template that matches the element -->
  <xsl:function name="pelssers:getURI" as="xs:anyURI">
    <xsl:param name="element" as="element()"/>
    <xsl:apply-templates select="$element" mode="getURI"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Collect the child elements of the root of every file listed in the manifest -->
    <xsl:variable name="elements"
                  select="for $doc in (for $href in manifest/file/@href
                                       return document(xs:anyURI(concat($sourceFolder, $href))))
                          return $doc/*/*"/>
    <!-- Group by target URI: write the first element of each group, skip the rest -->
    <xsl:for-each-group select="$elements" group-by="pelssers:getURI(.)">
      <xsl:apply-templates select="current-group()[1]" mode="write"/>
      <xsl:apply-templates select="subsequence(current-group(), 2)" mode="skip"/>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="basictype | package" mode="write">
    <xsl:variable name="uri" select="pelssers:getURI(.)"/>
    <xsl:message>Processing <xsl:value-of select="local-name()"/> to URI <xsl:value-of select="$uri"/> </xsl:message>
    <xsl:result-document method="xml" href="{$uri}">
      <!-- Wrap the element in a copy of its original parent element -->
      <xsl:element name="{../local-name()}">
        <!-- copy-of keeps the parent's attributes as attributes; apply-templates would
             fall through to the built-in rule and emit their values as text -->
        <xsl:copy-of select="../@*"/>
        <xsl:copy-of select="."/>
      </xsl:element>
    </xsl:result-document>
  </xsl:template>

  <xsl:template match="basictype | package" mode="skip">
    <xsl:variable name="uri" select="pelssers:getURI(.)"/>
    <xsl:message>Warning !! Skipping duplicate <xsl:value-of select="local-name()"/> with URI <xsl:value-of select="$uri"/> </xsl:message>
  </xsl:template>

  <!-- One mode="getURI" template per element type -->
  <xsl:template match="basictype" as="xs:anyURI" mode="getURI">
    <xsl:sequence select="xs:anyURI(concat($destinationFolder, 'basictypes/', @identifier, '.xml'))"/>
  </xsl:template>

  <xsl:template match="package" as="xs:anyURI" mode="getURI">
    <xsl:sequence select="xs:anyURI(concat($destinationFolder, 'packages/', @id, '.xml'))"/>
  </xsl:template>

</xsl:stylesheet>
The output of running this transformation nicely reports what's happening.
[Saxon-HE] Processing basictype to URI file:///c:/pelssers/demo/export/basictypes/PH3330L.xml
[Saxon-HE] Warning !! Skipping duplicate basictype with URI file:///c:/pelssers/demo/export/basictypes/PH3330L.xml
[Saxon-HE] Processing basictype to URI file:///c:/pelssers/demo/export/basictypes/BUK3F00-50WDFE.xml
[Saxon-HE] Processing package to URI file:///c:/pelssers/demo/export/packages/SOT669.xml
[Saxon-HE] Processing package to URI file:///c:/pelssers/demo/export/packages/SOT600-1.xml
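The alternative skip handler mentioned earlier, writing duplicates to another folder instead of only logging them, could look roughly like the sketch below. It assumes the stylesheet above is saved as manifest_transformer.xsl (the name that appears in the error report) in the same folder. The duplicates/ subfolder and the file naming scheme are hypothetical choices, and I have not run this variant against the demo files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="2.0">

  <!-- Reuse everything from the stylesheet above; the file name is an assumption. -->
  <xsl:import href="manifest_transformer.xsl"/>

  <!-- Overrides the imported mode="skip" template, because templates in the
       importing stylesheet have higher import precedence. -->
  <xsl:template match="basictype | package" mode="skip">
    <!-- 'duplicates/' is a hypothetical folder. generate-id() differs for every
         node within one transformation, so no two duplicates share a URI. -->
    <xsl:variable name="uri"
                  select="xs:anyURI(concat($destinationFolder, 'duplicates/',
                                           local-name(), '-', generate-id(), '.xml'))"/>
    <xsl:message>Writing duplicate <xsl:value-of select="local-name()"/> to URI <xsl:value-of select="$uri"/></xsl:message>
    <xsl:result-document method="xml" href="{$uri}">
      <!-- Same wrapping as in mode="write": parent element name around a copy of the element -->
      <xsl:element name="{../local-name()}">
        <xsl:copy-of select="."/>
      </xsl:element>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

Note that the values returned by generate-id() are implementation-dependent and not stable between runs, so the resulting file names are only good for keeping the duplicates apart, not for referring to them later.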