If you're interested in functional programming, you might also want to check out my second blog, which I'm actively working on!

Thursday, December 15, 2011

NLP exercise: convert back a rotated String

This week we had a little assignment in which we receive a text as input, but every letter in the text has been rotated (shifted through the alphabet) by some unknown number of positions. Our mission is to find out what the original text was.
Below follows a naive approach: plain brute force, shifting all letters by every amount from 1 to 25.
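
As a rough illustration of the brute-force idea, here is a minimal sketch in Python (it is not the Scala trait from this post). The names rotate and brute_force are made up for the example, and it assumes that only the ASCII letters a-z and A-Z are rotated while everything else is left untouched.

```python
import string


def rotate(text, shift):
    """Shift every ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Digits, spaces and punctuation are copied unchanged (an assumption).
            result.append(ch)
    return "".join(result)


def brute_force(text):
    """Return all 25 candidate texts as (shift, candidate) pairs."""
    return [(shift, rotate(text, shift)) for shift in range(1, 26)]


if __name__ == "__main__":
    # One of the 25 printed lines will be readable; spotting it is left to the reader.
    for shift, candidate in brute_force("Khoor, Zruog!"):
        print(shift, candidate)
```

Picking the right candidate is still manual here: you read the 25 lines and look for the one that makes sense.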

TextRotator trait:


The test script:


Output of running the NLPTest:


The solution can be found on line 16.

Wednesday, November 30, 2011

Work in progress: Reflections on reusable widgets

Just thinking out loud and sharing some ideas. The reason I'm doing so is to get feedback on Twitter or through comments. I have no clue yet how powerful the existing widgets out there are, so I'd like to get more insight from people who do. And I think the high-level ideas in the widget definition below speak louder than words.



Tuesday, November 29, 2011

Higher order declarative XML and dynamic XSLT

I've been playing with the idea that when you hook a few advanced technologies together you could build some nice stuff. The things that came to my mind are:
  • playframework 2.0
  • Scala
  • Apache Cocoon 3.0
  • latest version of saxon
  • a Database to store application resources (XSLT, templates, images, ...)




The idea I had in mind was to create reusable widgets, and I was reminded today that such things are already on the market. However, it's hard for me to compare their ease of use, performance and usability for commercial businesses. In the meantime I started experimenting a bit and came up with the following widget, which basically generates the same output (a table containing the tweets of a particular person) as in my previous article. In that article, however, no higher-order XML declarations were used. I think you will see that with some XSLT magic you get a lot done in a few lines. It took some time to get it working properly, but the table widget now takes a @data-source which you can bind to any table:component.

By the way, with higher-order declarative XML I am referring to the analogy with higher-order functions such as map, reduce and foreach. As you can see from the sample below, a single table:row instruction can be seen as a function which maps a sequence (of nodes selected with the @bind) to table:rows.

The hard part of getting the widget below to work was thinking outside the box: generate an XSLT on the fly, using XSLT, which does all the heavy lifting.

Using client side XSLT with Saxon-CE

Yesterday I decided to have a go with Saxon-CE, as I can definitely see the benefits of doing client-side transformations. Saxon-CE is still in alpha, so it is not recommended for production use yet.
I really wanted to come up with a nice demo, so I decided to check whether I could build a webpage using live Twitter Search data. I ran into my first issue right away: the @data-source attribute does not work cross-domain, and it does not even seem to work in Chrome or IE. So I had to manually download the XML results of a Twitter search for Michael Kay's tweets. They are included in this downloadable zip, but to get an idea of what the data looks like you can open the following URL: Michael Kay's tweets

The final result can be seen in screenshot below:

Friday, November 25, 2011

Introduction into functional programming with Javascript

Some people underestimate the powerful concepts of Javascript. It is an ideal language to start understanding 'functional programming'. To get a better grip on the different ways to accomplish the same task I wrote some basic and some more advanced code, from which I hope some people can benefit.
To see what the scripts actually do, save each script under a name such as script1.js or script2.js and include them in the HTML page as shown below. Then load the page in your browser and you will see the output printed to the console.








Friday, November 18, 2011

Apache Cocoon: From XML to JSON as datasource for client or server side Javascript

In a Cocoon project I am currently working on we are using Sedna XMLDB. All the information we use is stored as XML and I just wanted to share some insight into several use cases we
are handling. As a side note, we are using a custom XQueryGenerator to fetch all data, but the use cases below apply equally to a file repository.

Use case 1: publish content from a single XML source (HTML/ PDF/ ...)
---------------------------------------------------------------------
As we only need a single source file it is often easier to use XSLT to generate the final HTML, as XSLT currently offers more advanced features than XQuery. To name a few
examples, XQuery lacks good out-of-the-box support for:
- grouping
- formatting

Also, you have to use another mindset when programming with XSLT. In XSLT you basically declare WHAT to do when the XSLT processor matches a specific
node or attribute, so you don't have to take into account all possible combinations that might occur. In XQuery, on the other hand, you have to state explicitly what you want to produce.


Use case 2: publish content from multiple XML sources (HTML/ PDF/ ...)
---------------------------------------------------------------------
While Cocoon does allow you to get the job done using e.g. the cinclude transformer or the aggregate generator, it becomes rather messy very quickly.

And there are many ways to accomplish the same result. You might choose to write several pipelines, each one generating and extracting (with XSLT) the data needed, and then aggregate those results.
Or you could just aggregate all sources in one go and write a single XSLT to extract the data.

For this use case, however, it is much more convenient to write a single XQuery which generates whatever you need.


Use case 3: perform some conditional logic based upon the source XML before processing it
-----------------------------------------------------------------------------------------

Let's say the XML source describes either a Male or a Female and you can find out which by fetching the content of the Gender tag.

Again, many roads lead to Rome. To name just one: write a Java component which extracts the data you need and use it from your flowscript. But how easily can this be generalized? What methods should the interface contain? Different use cases might call for different interfaces, and you don't want to end up creating new Java classes all the time.

Use case 4: you want to use some fancy Javascript widget which needs a JSON datasource
-----------------------------------------------------------------------------------------
If all your data came straight from a relational database, your entities could easily be mapped to JSON. There are plenty of libraries out there. But how do we generate JSON from XML in particular?


To solve use cases 3 and 4 I decided to come up with an XML dialect which can represent JSON and also do some formatting if needed.

Below is a sample representation of the JSON-XML dialect.



Generic transformer (XSLT2.0) which generates a JSON string from the input
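
To make the idea of an XML representation of JSON more tangible, here is a minimal sketch in Python rather than XSLT 2.0. The element names used (object, field, array, string, number, boolean, null) are an assumed, hypothetical dialect chosen for the example and not necessarily the one from this post, and the date and number formatting mentioned above is left out.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document in the assumed dialect.
SAMPLE = (
    '<object>'
    '<field name="id"><number>1234</number></field>'
    '<field name="name"><string>Ann</string></field>'
    '<field name="tags"><array><string>a</string><boolean>true</boolean><null/></array></field>'
    '</object>'
)


def to_python(node):
    """Recursively convert one element of the assumed dialect to a Python value."""
    tag = node.tag
    if tag == "object":
        # Each field carries its key in @name and its value as its first child.
        return {field.get("name"): to_python(field[0]) for field in node.findall("field")}
    if tag == "array":
        return [to_python(child) for child in node]
    if tag == "string":
        return node.text or ""
    if tag == "number":
        text = node.text.strip()
        return float(text) if any(c in text for c in ".eE") else int(text)
    if tag == "boolean":
        return node.text.strip() == "true"
    if tag == "null":
        return None
    raise ValueError("unexpected element: " + tag)


def xml_to_json(xml_text):
    """Parse the XML and serialize the resulting value as a JSON string."""
    return json.dumps(to_python(ET.fromstring(xml_text)))


if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

The design choice worth noting is that the XML stays purely structural: every JSON type has its own element, so the converter never has to guess whether a value is a string or a number.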




A sitemap example generating JSON from XML


Now we can invoke the URI 'data2json/employees/1234' and the corresponding JSON will be generated, which can be used on both the server and the client side.

An example of how we can use this from flowscript:

Wednesday, October 26, 2011

Code vectorization examples

To greatly reduce complexity, and often even to gain performance, you can use a technique called 'code vectorization'.


Example 1:
----------


Let's check the outcome of the vectorized formula:


Example 2:
----------

Tuesday, October 25, 2011

Using higher order functions with Octave

I was working on implementing the sigmoid exercise from the Machine Learning class. The assignment stated the following requirement: 'Your code should also work with vectors and matrices'. I already sensed that this was calling for higher-order functions.

The sigmoid function is basically defined as g(z) = 1 / (1 + e^(-z)).


Now you could of course write some for loops each time you want to apply a function to each element of a matrix, but it makes more sense to write a generic mapper function which takes two parameters: a matrix (or vector) and a function to be applied to each element of that matrix. This mapper function then returns a new matrix. The first thing to do was to check whether Octave supports higher-order functions. It turns out 'feval' comes to the rescue.


To demonstrate that this works I wrote a simple function which doubles its argument:


Now from the command line:




As Davy Meers pointed out, Octave also supports anonymous functions, which are even more powerful.
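
The mapper idea from this post can be sketched outside Octave as well. Below is a minimal Python version, assuming that a matrix is represented as nested lists and a vector as a flat list; the names mapper, sigmoid and double are chosen for the example. An anonymous function (a lambda) plays the role of the doubling function.

```python
import math


def mapper(matrix, fn):
    """Apply fn to every element of a scalar, vector or matrix and return a new structure."""
    if isinstance(matrix, (int, float)):
        return fn(matrix)
    # Recursing over the rows handles both vectors and matrices.
    return [mapper(row, fn) for row in matrix]


def sigmoid(z):
    """The sigmoid function g(z) = 1 / (1 + e^(-z)) for a single number."""
    return 1.0 / (1.0 + math.exp(-z))


# Anonymous function that doubles its argument.
double = lambda x: 2 * x

if __name__ == "__main__":
    print(mapper([[1, 2], [3, 4]], double))
    print(mapper([0, 0], sigmoid))
```

Because the element-wise function is passed in as a parameter, the same mapper works for sigmoid, for doubling, or for any other function of a single number.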

Monday, October 24, 2011

Best practices XSLT

I'm pleased to see that my own findings in programming with XSLT match the ones from Andrew Welch:
  1. don't use xsl:for-each, use xsl:apply-templates
  2. don't use named templates, use xsl:apply-templates with a mode
  3. if a template just contains a choose/when, separate out the branches into individual templates
  4. instead of xsl:value-of use xsl:apply-templates
  5. one large template is bad, lots of specific small templates is good
Regarding (4): xsl:value-of will create a text node in the result tree, xsl:apply-templates will ultimately apply templates on the text node where the default template will kick in and copy it to the result tree... so the effect is the same, however in the latter case you are making it available for overriding.

so for example:

you could do:



or you could do:


both will output the same result.

Now if a requirement comes in to wrap the type in <b>, but only if it's 'beer', and you have used apply-templates, you can just add the
template:


...and then go home. However, if you used value-of earlier on,
you would need to go and modify that template first to change it to an apply-templates anyway.



Friday, September 30, 2011

The power of Apache Cocoon, Xquery, and XSLT extensions

This article describes the use case where you have product data stored in an XML database.
Your customer wants to be able to search products based on (part of) the product name and display
the product properties in an HTML page. However, all products are also part of a workflow, and to find
the status of a particular product we have to fetch information from another system. To fetch the status,
this demo includes a MonitorClient which offers the needed functionality. The demo describes how to generate the
entire page using purely Apache Cocoon, XQuery and XSLT. The only custom Cocoon component we developed is
the XQueryGenerator, which basically reads an XQuery from the specified @src attribute, injects any sitemap parameters from the match pattern and returns the results.


PH3330L.xml: Sample data which is stored in XMLDB:


products.xqlib: XQuery library for retrieving products:



products.xquery: returns a xhtml page containing matching products based on their name:









Beans configured in Spring application context:




client.xslt: replaces client:getStatus with actual value

Wednesday, September 28, 2011

Linux nice-to-know commands

How to retrieve the fully qualified hostname?


How to copy a file from one machine to another?


How to tail the last 100 lines of a logfile (in combination with the more command)?

Sunday, September 11, 2011

Benchmarking HTTP requests using ApacheBench

After watching a presentation by Ryan Dahl about Node.js I learned about a nice benchmarking tool called ApacheBench. So I decided to give it a go on the simplest Node.js server.

Friday, September 2, 2011

Quartz Jobscheduler dashboard

I have a little project running at my customer's site which is basically nothing more than a bunch of cron jobs deployed as a web application. As one cron job seemed to fail from time to time, I wanted to get some more insight into which triggers are defined, a way to pause or resume those triggers, and a way to check which jobs are currently running.

Here are some screenshots of my first prototype dashboard:









Wednesday, July 6, 2011

Unit testing XQJ and Saxon

I have to admit that it took me quite a bit of time to get this unit test working. All these namespaces don't make life easier. But let me explain what is going on in the code snippets below. First I wrote a little module which has one function that returns the groupId as a string. Luckily Saxon supports the "at" hint for importing modules. But it took me quite some time to understand which base URI Saxon was using. By default it seems to be "", but Saxon somehow knows how to resolve it to 'file:///c:/development/workspaces/cocoon3/cocoon-xmldb/'.



module containing 1 function that extracts groupId from pom


pom_module.xquery which imports pom.xqlib module and outputs groupId



Tuesday, July 5, 2011

Unit testing XQJGenerator

My first unit tests would only work if I had an XML database up and running. That was not really convenient. So I looked around for an XQJ implementation that I could use from my unit tests. Saxon9-HE came to the rescue.

I added the following dependency to my pom.xml after installing the jar in my local Maven repo:




test-application-context.xml:


xquery for unit-test: splitting a sentence into words



xquery for unit-test: retrieving all ASF members from pom.xml


unit-test: transform a plain string of words into xml output


unit-test: return the names of all ASF-members of pom.xml

Using XQJ API with Cocoon3

As my research pointed out, XMLDB was a bit painful to use, as most XMLDB implementations have quite specific APIs to accomplish the same result. I instead turned my focus to the newer XQJ API, which offers a JDBC-like approach to using XQuery.

Today I wrote a first draft version of an XQJGenerator, accompanied by 2 basic unit tests. You have to be careful, because not all XQueries can be ported from the XMLDB API to XQJ, but I managed to get the results extracted the way I wanted to.

XQJGenerator.java


XQJGeneratorTest.java


xquery1.xquery


demoboards1.xquery


test-application-context.xml


xquery1.xquery output


demoboards1.xquery output

Friday, July 1, 2011

Comparing usage of XMLDB Services

As I want to be able to use all XMLDB API services from within Cocoon, a first task is to compare in what respects usage differs across 'Service' and 'XMLDB' implementations.

Any good observer can see that it's possible to abstract this a bit further:

Using XpathQueryService with BaseX:




Using XQueryService with Sedna:

Cocoon 3.0 and XML DB API

Yesterday evening I checked out the latest sources of Cocoon 3.0 and started diving into the source code to get a feeling for how things work in this new version. As Jeroen Reijn already pointed out in this article, it only takes you half an hour to be up and running building your own custom pipeline.

The company where I'm currently working recently started using an XML DB (instead of storing files on a filesystem) based upon my advice. We were already using a custom XQueryGenerator for Cocoon 2.2, but as a little exercise I quickly hacked some code together based upon Cocoon 3. The actual code of the XQueryGenerator is not yet worthy of being shared, but I just wanted to share the results of 2 little unit tests.



demoboard module:


demoboards1.xquery:


demoboards2.xquery:


Both XQueries should return the same result, and they do.

Tuesday, June 28, 2011

Finished some restyling of this blog

Ok.. I just received news of Google's +1 button, which is a kinda cool way to get noticed, so I decided to hack it in. Because I again played around with the template designer of my blog, my previous customizations got lost. Nothing to worry about really; only SyntaxHighlighter was not working anymore and my code snippets magically vanished. So I decided to switch to the latest and greatest version of SyntaxHighlighter and use the autoloader, which is a new feature. I am also using the hosted version of the css and js files, which is easier.

One side note: I had to make sure SyntaxHighlighter got bootstrapped only after the DOM was loaded. So I used YUI3's onDomReady event to call a bootstrapSyntaxHighLighter function I wrote.

Tuesday, June 14, 2011

Mimicking the Switch Expression in XQuery1.0

Here follows a little trick I picked up. Sometimes you want, just like in many other programming languages, to handle several cases depending on a switch expression. XQuery 1.0 does not offer this by default, but that does not mean you can't mimic it using the typeswitch expression ;-)



Output:

Complex Grouping with Xquery ...the general pattern

Testdata:


The idea is that we group all items so we get combinations of a chapter with its respective paragraphs. However, some paragraphs don't have a parent (chapter). Those paragraphs should be handled first.


Expected output:


Well, here is a kind of generic solution (read: solution pattern). You need one function that extracts the grouping key from an item. The getGroupingKeys function just takes the distinct values of all grouping keys. Once you have the grouping keys, you can determine the items belonging to each group using the predicate [local:getGroupingKey(.) = $grouping_key]. Next you handle the items of each group the way you want.
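
The same pattern can be sketched in Python. This is only an illustration under assumptions: items are plain dictionaries, a paragraph's chapter is stored under a hypothetical "chapter" key, and paragraphs without a chapter get the empty string as grouping key so that they can be handled first.

```python
def get_grouping_key(item):
    """Extract the grouping key from one item; orphans get the empty key (assumption)."""
    return item.get("chapter") or ""


def get_grouping_keys(items):
    """Distinct grouping keys, in order of first appearance."""
    return list(dict.fromkeys(get_grouping_key(item) for item in items))


def group_items(items):
    """Return (key, items) pairs, with the group of parentless items first."""
    # The sort is stable, so the remaining keys keep their original order.
    keys = sorted(get_grouping_keys(items), key=lambda k: k != "")
    return [(key, [i for i in items if get_grouping_key(i) == key]) for key in keys]


# Hypothetical test data: p2 has no parent chapter.
ITEMS = [
    {"id": "p1", "chapter": "c1"},
    {"id": "p2"},
    {"id": "p3", "chapter": "c2"},
    {"id": "p4", "chapter": "c1"},
]

if __name__ == "__main__":
    for key, members in group_items(ITEMS):
        print(repr(key), [m["id"] for m in members])
```

The list comprehension inside group_items plays the role of the predicate: it keeps exactly the items whose grouping key equals the key of the current group.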

Generic Join function (XSLT/ Xquery)

In case you need a very high level join function for a sequence of items, I just wrote one which works very nicely.

Thursday, March 24, 2011

Portable GIT for windows

I was facing issues setting up my git account the other day because my default user.home was pointing to an H: drive on which I only had read rights. Really helpful, if you ask me. Anyway... I found this nice article on how to create a portable GIT config.

Wednesday, March 16, 2011

Building JSON webservices with Sedna, XQuery, Apache Cocoon and XSLT2.0 (part 2)

Next I needed to come up with a generic XSLT that would transform the Javascript-ish XML representation to JSON.

So after some struggling I ended up with the following solution:



So with input below I would get

output below:

Please pay attention to the fact that I even took care of supporting formatting date(time)s and numbers with my implementation.

Another more complex example

will result in

Building JSON webservices with Sedna, XQuery, Apache Cocoon and XSLT2.0 (part 1)

Goal of project: Easily allow webapps within our intranet to fetch data from XML DB (Sedna) in JSON format.

First I wanted to come up with an XML schema to represent JSON data. So after some trial and error I ended up with the following schema:


Some concrete examples of above schema:










More info available at Part 2

Wednesday, February 23, 2011

Xquery exercise

This week I had to come up with a temporary solution to convert a DITA map, including its topics, into our currently used XML schema for Value Propositions. I am just blogging the query because it makes nice reference material. In case you are wondering where the mcvpId variable gets its value from: it is the only parameter I inject from Java.

Thursday, January 6, 2011

Tunneling parameters with XSLT2.0

As you can see, the percentages in the XML snippet before transformation add up to 99.90%.
The customer wanted to apply a correction to the first maximum percentage so that all percentages would add up to 100%.
So this means adding 0.10 to the "lead" material percentage. A little inquiry on the XSLT mailing list showed me how to solve this elegantly.
Special thanks go to David Carlisle for the feedback.
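
The arithmetic behind the correction can be sketched in a few lines of Python. This only illustrates the rule itself (add the missing difference to the first maximum), not the XSLT 2.0 tunnel-parameter solution; the function name and the sample values are made up for the example, apart from the 99.90 total and the 0.10 correction mentioned above.

```python
from decimal import Decimal


def correct_to_hundred(percentages):
    """Add the difference with 100 to the first occurrence of the maximum percentage."""
    difference = Decimal("100") - sum(percentages, Decimal("0"))
    # list.index returns the first position of the maximum, which settles ties.
    first_max = percentages.index(max(percentages))
    return [p + difference if i == first_max else p for i, p in enumerate(percentages)]


# Hypothetical values that add up to 99.90.
SAMPLE = [Decimal("60.00"), Decimal("25.50"), Decimal("14.40")]

if __name__ == "__main__":
    print(correct_to_hundred(SAMPLE))
```

Decimal is used instead of float so that 99.90 plus 0.10 is exactly 100 and no rounding noise shows up in the corrected value.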

BEFORE:


AFTER: