If you're interested in functional programming, you might also want to check out my second blog, which I'm actively working on!

Thursday, June 14, 2012

Regular XQuery versus XQuery using an index [Sedna XMLDB]

This article shows two different ways of querying for data: accessing the collection directly versus using an index to fetch the data.
declare function local:getBasicTypes($productIds as xs:string*) as element(Product)* {
    (: remark: @identifier = $productIds compares against every item in the sequence, like SQL's "identifier IN (...)" :)
    collection("basicTypes/released")/Product[@identifier = $productIds]
};

let $ids := ('PH3330L', 'PH3030CL')
return
  <result>
  {
    for $bt in local:getBasicTypes($ids)
    return $bt/ProductInformation/Description
  }
  </result>

Now let us create an index on @identifier and retrieve the data using this index:
create index "basictype_id"
  on fn:collection("basicTypes/released")/Product
  by @identifier
  as xs:string

declare function local:getBasicTypesByIndex($productIds as xs:string*) as element(Product)* {
    for $id in $productIds return index-scan('basictype_id', $id, 'EQ')
};

let $ids := ('PH3330L', 'PH3030CL')
return
  <result>
  {
    for $bt in local:getBasicTypesByIndex($ids)
    return $bt/ProductInformation/Description
  }
  </result>

The result is the same for both:
<result>
  <Description>N-channel TrenchMOS logic level FET</Description>
  <Description>9657 Trench 7 (IMPULSE)</Description>
</result>
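
To make the difference between the two approaches concrete outside of Sedna, here is a minimal Python sketch. It is only an analogy, not how Sedna implements indexes: the `products` list stands in for the collection, the descriptions are placeholders, and the function names are made up for this illustration.

```python
# Hypothetical in-memory stand-in for the "basicTypes/released" collection.
# The identifiers come from the example above; the descriptions are placeholders.
products = [
    {"identifier": "PH3330L", "description": "description A"},
    {"identifier": "PH3030CL", "description": "description B"},
    {"identifier": "SOMETHING_ELSE", "description": "description C"},
]


def get_by_scan(ids):
    """Visit every product and keep those whose identifier is in ids."""
    wanted = set(ids)
    return [p for p in products if p["identifier"] in wanted]


# Build the index once: identifier -> list of products with that identifier.
index = {}
for p in products:
    index.setdefault(p["identifier"], []).append(p)


def get_by_index(ids):
    """Look up each id directly, comparable to one index-scan call per id."""
    return [p for i in ids for p in index.get(i, [])]


ids = ["PH3330L", "PH3030CL"]
scan_result = [p["description"] for p in get_by_scan(ids)]
index_result = [p["description"] for p in get_by_index(ids)]
print(scan_result == index_result)
```

The scan touches every product on every query, while the index lookup only touches the entries for the requested ids, which is where the gain comes from on a large collection.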

Here is an example of creating an index using namespaces:
declare namespace cat = "urn:iso:std:iso:ts:29002:-10:ed-1:tech:xml-schema:catalogue";
create index "legacy_id"
  on fn:collection("legacyBasicTypes")/cat:catalogue
  by cat:item/cat:reference/@reference_number
  as xs:string

Wednesday, June 13, 2012

XSLT not powerful, really??


<?xml version="1.0"?>
<sparql>
  <head>
    <variable name="subnode"/>
  </head>
  <results>
    <result>
      <binding name="subnode">
        <uri>http://data.kasabi.com/dataset/nxp-products/basicType/LPC1114FA44</uri>
      </binding>
    </result>
    <result>
      <binding name="subnode">
        <uri>http://data.kasabi.com/dataset/nxp-products/productTree/1498</uri>
      </binding>
    </result>
  </results>
</sparql>

We want to extract the identifier from each URI and save the identifiers as text, one per line. So the expected output for this sample is:
LPC1114FA44
1498

There are multiple ways to solve this, but the solution below is a pretty easy one. It tokenizes each URI on '/', takes the last part, and string-joins the results with a newline character.
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output  method="text" encoding="UTF-8" media-type="text/plain"/>
  
  <xsl:template match="/">
    <xsl:variable name="uris" select="/sparql/results/result/binding/uri"/>
    <xsl:value-of select="string-join(for $uri in $uris return tokenize($uri, '/')[last()], '&#xa;')"/>
  </xsl:template>
  
</xsl:stylesheet>
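
For comparison, the same tokenize-and-join idea can be sketched in Python with the standard library. This is only an illustration of the technique, not a replacement for the stylesheet; `SAMPLE` and `extract_identifiers` are names invented for this sketch.

```python
import xml.etree.ElementTree as ET

# The sample SPARQL result from above, without the <head> section.
SAMPLE = """<sparql>
  <results>
    <result>
      <binding name="subnode">
        <uri>http://data.kasabi.com/dataset/nxp-products/basicType/LPC1114FA44</uri>
      </binding>
    </result>
    <result>
      <binding name="subnode">
        <uri>http://data.kasabi.com/dataset/nxp-products/productTree/1498</uri>
      </binding>
    </result>
  </results>
</sparql>"""


def extract_identifiers(xml_text):
    """Return the last path segment of every <uri>, joined by newlines."""
    root = ET.fromstring(xml_text)  # root is the <sparql> element
    uris = root.findall("./results/result/binding/uri")
    # Same idea as tokenize($uri, '/')[last()] followed by string-join.
    return "\n".join(uri.text.split("/")[-1] for uri in uris)


output = extract_identifiers(SAMPLE)
print(output)
```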

Thursday, June 7, 2012

Managing XML Schemas and Modules with Sedna

At my customer's site we use the Sedna XMLDB. In the beginning we stored all XML schemas used for validation on insertion in a custom XMLDB wrapper on top of the Sedna driver for Java. That turned out to be a bit tedious, as we regularly have schema changes and had to redeploy all projects using the XMLDB wrapper. We also stored some schemas redundantly in a Cocoon application which generates some data on the fly using transformations. We solved this problem by also storing the XML schemas in the XMLDB itself. I also installed 2 admin clients, but the one I use most is SDBAdmin for Windows, which can be downloaded from here. It's pretty easy to connect to your database on localhost (default port is 5050). You can also use PuTTY and set up a tunnel to your QA or PROD server.







Loading or replacing a module can be done by using the following command in the Query panel:
LOAD OR REPLACE MODULE "C:\workspaces\nxp\xmldbClient\src\main\resources\xquery\basictype.xqlib"

Replacing an XML schema takes two steps. If you are uploading a schema for the first time, only the second command needs to be executed:
DROP DOCUMENT "ChemicalContent.xsd" IN COLLECTION "xmlSchemas"

LOAD "C:/development/workspaces/intellij11/CTPI-PX/spider2/schemas/ChemicalContent.xsd" "ChemicalContent.xsd" "xmlSchemas"

You can actually also execute commands using the terminal:
pxqa1@nlscli72:/appl/spider_qa/sedna/pxqa1/sedna/bin>./se_term nxp
Welcome to term, the SEDNA Interactive Terminal. Type \? for help.
nxp> LOAD "/home/pxqa1/deployment/iso29002-10xml_V099.xml" "legacy.xml" "legacy" &
Bulk load succeeded
nxp>

Thursday, April 12, 2012

Where XQuery meets XSLT2.0

Goal: Return input element whose @value should be set to current date formatted as [yyyy-MM-dd]


Problem: XQuery lacks good support for formatting dates but offers the current-date() function

Resolution: Use XSLT to postprocess custom XML tags (instructions) in their own namespace

XQuery snippet:


XSLT to postprocess the XQuery result:

Wednesday, April 4, 2012

Using breadth-first-search to solve dynamic programming problem

The previous implementation was using depth-first-search and might be improved by switching to breadth-first-search.


Dynamic Programming to the rescue

In my previous blog post I put together a solution that created a graph of all possible paths starting from the initial node. Next I filtered out the ones not ending at the goal node. For the paths that were left over I finally calculated the total path cost. As the number of lanes increases, the number of possible paths grows exponentially and my previous code runs into timeouts.

So now I present the same solution, drastically improved using dynamic programming.
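
The general idea can be sketched in Python on a toy graph. This is not the code from the post: the `GRAPH` below, its costs, and all function names are assumptions made up for the illustration. It contrasts enumerating every path with a memoized version that solves each node only once.

```python
from functools import lru_cache

# Hypothetical weighted, acyclic graph: node -> {successor: edge cost}.
GRAPH = {
    "start": {"a": 1, "b": 4},
    "a": {"b": 2, "goal": 6},
    "b": {"goal": 1},
    "goal": {},
}


def all_paths(node, goal):
    """Brute force: enumerate every path from node to goal."""
    if node == goal:
        return [[node]]
    return [[node] + rest
            for nxt in GRAPH[node]
            for rest in all_paths(nxt, goal)]


def path_cost(path):
    """Sum the edge costs along a path."""
    return sum(GRAPH[a][b] for a, b in zip(path, path[1:]))


def brute_force_min(start, goal):
    """Cheapest cost found by costing every enumerated path."""
    return min(path_cost(p) for p in all_paths(start, goal))


@lru_cache(maxsize=None)
def dp_min(node, goal="goal"):
    """Dynamic programming: cheapest cost from node to goal, cached per node."""
    if node == goal:
        return 0
    costs = [cost + dp_min(nxt, goal) for nxt, cost in GRAPH[node].items()]
    return min(costs) if costs else float("inf")


print(brute_force_min("start", "goal"), dp_min("start"))
```

Both functions return the same cost on this toy graph, but the memoized version reuses the result for a node instead of recomputing it for every path that passes through it, so its work grows with the number of edges rather than the number of paths.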



Tuesday, April 3, 2012

Practising Python skills

I recently followed 2 courses from Udacity:
  • CS101: Building a search engine
  • CS373: Programming a robotic car

Both courses used Python as the programming language, and I have to admit that it has a low barrier to entry. I'm currently thinking about a programming assignment which can be solved in a number of ways:

  • Dynamic programming
  • A*
  • ...

For the time being, here is some experimental code that is subject to heavy change ;-)







Tuesday, March 27, 2012

Jackson now also supports XML

Today I again needed to serialize some plain old Java objects to XML. Having worked before with XStream and Castor, I looked into Jackson today.
I quickly discovered that the ToXmlGenerator could be set up by calling createJsonGenerator() on the XMLFactory. That sounds strange at first, and it was no surprise to me that because of this you have to write a lot of boilerplate code to get the job done. For serializing JSON you have to know whether you are dealing with booleans, strings, numbers and so on. It's all about getting the quotes "" right. For XML this is not the case, however, and I immediately started writing a JacksonXMLGeneratorHelper to ease the pain. Not only did this reduce the LOC drastically, but I think the code also became much more readable.





The output from the two unit tests produces the same result as expected.

Thursday, December 15, 2011

NLP exercise: convert back a rotated String

This week we had a little assignment where we receive a text as input, but all letters in the text are rotated by some number. Our mission is to find out what the original text was.
Below follows a naive approach that just applies brute force, shifting all letters by every amount in the range 1 to 25.
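
As a rough illustration of the brute-force idea (in Python, and not the TextRotator trait from this post), the sketch below generates all 25 candidate texts so that a human can pick the readable one. The function names and the sample input "Khoor Zruog" are made up for this example.

```python
import string


def rotate(text, shift):
    """Shift every ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)  # leave spaces, digits and punctuation untouched
    return "".join(result)


def candidates(text):
    """Return (shift, rotated text) for every shift from 1 to 25."""
    return [(shift, rotate(text, shift)) for shift in range(1, 26)]


for shift, candidate in candidates("Khoor Zruog"):
    print(f"{shift:2d}: {candidate}")
```

Exactly one of the printed lines is readable English; the shift shown in front of it is the one that undoes the original rotation.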

TextRotator trait:


The test script:


Output of running the NLPTest:


The solution can be found on line 16.

Wednesday, November 30, 2011

Work in progress: Reflections on reusable widgets

Just thinking out loud and sharing some ideas. The reason I'm doing so is to get feedback on Twitter or through comments. I have no clue yet how powerful the existing widgets out there currently are, so I'd like to get more insight from people who do. And I think the high-level ideas in the widget definition below speak louder than words.