MikeTeo.net

A Software Technologist's Blog

Python Mutant Tester 0.1.0 Released

November 27, 2011 By miketeo

I have just released Python Mutant Tester (PyMuTester) version 0.1.0 to the public. More details can be found at the Python Mutant Tester project page.

PyMuTester is a testing tool that facilitates mutant testing of Python applications. By making small (and syntactically correct) modifications to the application’s source code and re-running the unit tests against the mutated code, we can uncover missed checks and loopholes in the test cases: a mutant that still passes every test points to behaviour the tests never verify.
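To make the idea concrete, here is a minimal sketch of one mutation and two test suites, written with Python's standard `ast` module. It is not PyMuTester's API; the `is_adult` function, the `GtEToGt` operator and both suites are hypothetical names made up for this illustration.

```python
import ast

# Hypothetical code under test.
SOURCE = """
def is_adult(age):
    return age >= 18
"""

class GtEToGt(ast.NodeTransformer):
    """Mutation operator: replace every `>=` comparison with `>`."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node

def load(tree):
    """Compile a module AST and return the is_adult function it defines."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["is_adult"]

def weak_suite(fn):
    # Never checks the boundary value, so it cannot tell `>=` from `>`.
    return fn(30) is True and fn(5) is False

def strong_suite(fn):
    # Adds the boundary check that the weak suite is missing.
    return weak_suite(fn) and fn(18) is True

original = load(ast.parse(SOURCE))
mutant = load(ast.fix_missing_locations(GtEToGt().visit(ast.parse(SOURCE))))

# A suite "kills" a mutant when it fails on the mutated code.
print("weak suite kills mutant:", not weak_suite(mutant))
print("strong suite kills mutant:", not strong_suite(mutant))
```

The weak suite passes on both the original and the mutant, which is exactly the kind of loophole mutant testing is meant to expose.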


JSON Functions for PostgreSQL

November 5, 2011 By miketeo

Hi, I have released a PostgreSQL contrib module for encoding database rows into JSON structures.

More information can be found at the JSON Functions for PostgreSQL project page.

PostgreSQL Extensions for CLIPS Expert System

October 30, 2011 By miketeo

Hi, I have released a patch for the CLIPS expert system (and also pyclips) that lets CLIPS users send SQL queries to a PostgreSQL database from within the CLIPS environment.

More information can be obtained at the PostgreSQL Extensions for CLIPS Expert System project page.

Large-scale server systems are common in this age. A small computing grid can easily consist of a few hundred nodes. Configuring such a server network can be tedious and usually requires a centralized change management system. If that centralized system goes down, configuration and updates on the nodes can be affected.

By establishing a convention for our DNS TXT records, we can embed information such as:

  • Remote upstream server’s hostname
  • Listening TCP ports on this hostname
  • Available services (in the form of a bitmask) on this server

For instance, suppose a node with the hostname node123.abc.com connects to job1.abc.com to communicate information. We can embed the upstream hostname (job1) in the DNS TXT record for node123, and embed the listening TCP ports and the types of services available in job1’s DNS TXT record. The client software on node123 then only needs to retrieve node123’s own DNS TXT record to learn that it should connect to job1 as its upstream server. It then retrieves job1’s DNS TXT record to learn which TCP ports on job1 accept incoming connections.
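The two-step lookup above can be sketched as follows. The `key=value;key=value` record format, the bit assignments in `SERVICES`, the port numbers and the `FAKE_DNS` table are all assumptions made for this example, since the post does not define a concrete format. The dictionary stands in for the resolver so the sketch runs without network access; a real client would replace `lookup` with an actual TXT query.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical meaning of each bit in the services bitmask.
SERVICES = {1: "job-queue", 2: "file-transfer", 4: "monitoring"}

@dataclass
class NodeConfig:
    upstream: Optional[str]
    ports: List[int]
    services: List[str]

def parse_txt(record: str) -> NodeConfig:
    """Parse an assumed 'key=value;key=value' TXT payload."""
    fields = dict(item.split("=", 1) for item in record.split(";") if item)
    ports = [int(p) for p in fields.get("ports", "").split(",") if p]
    mask = int(fields.get("services", "0"))
    services = [name for bit, name in SERVICES.items() if mask & bit]
    return NodeConfig(fields.get("upstream"), ports, services)

# Stand-in for real DNS TXT lookups (made-up records).
FAKE_DNS = {
    "node123.abc.com": "upstream=job1.abc.com",
    "job1.abc.com": "ports=9000,9001;services=5",
}

def resolve_upstream(hostname, lookup=FAKE_DNS.get):
    """Read our own TXT record, then the TXT record of the upstream it names."""
    node = parse_txt(lookup(hostname))
    upstream = parse_txt(lookup(node.upstream))
    return node.upstream, upstream.ports, upstream.services

print(resolve_upstream("node123.abc.com"))
```

Keeping the payload this compact matters in practice, because a TXT record leaves little room for data.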

Advantages

  • Resilient distribution mechanism built on top of DNS infrastructure with DNS caching and secondary DNS servers.
  • Centralized configuration node can be “hidden” behind a firewall without being exposed on the Internet.
  • Zero-configuration on the nodes as all configuration information could potentially be learnt from the DNS TXT records, so a single installation base can be utilized for all computing nodes. In fact, in theory, you will only need to configure the node’s hostname and IP network information.

Disadvantages

  • Delay in changes from configuration updates to actual change implementation on the server (due to DNS cache)
  • Potential leaks in configuration information as anyone can “lookup” the DNS records.
  • Often requires significant changes in client source code to use DNS TXT records to learn about its connection configuration
  • Limited by the small data size of a DNS TXT record (each character string in a record holds at most 255 bytes).

My Experience with Using CLIPS

March 25, 2011 By miketeo

(A Chinese translation of this post is available here)

I had the opportunity to incorporate a CLIPS expert system into one of my recent projects, for bill plan logic and for monitoring the health of our system modules. Both are classic sore points for OO/procedural programming methodologies. We had attempted to implement the bill plan logic in an initial version using Python, but it ended up as tangled code with half a dozen levels of nested if-then-else control structures. The system would eventually have become an excellent case study in project maintenance failure if we had persisted in using an OO/procedural language for the implementation.

Even though CLIPS has worked well for us, I still feel that it may not be suitable for all projects.

Requires radical change in programming paradigm.
Instead of executing your operations in a procedural manner, you have to “train” yourself to re-think them as separate rules that operate in tandem in a series of recognize-act cycles. When a group of rules matches, a series of actions is performed, and these actions may change the conditions that the rules test. This may lead to other rulesets (or even the current ruleset) being triggered.
Another issue you will face is getting these rulesets to “fire” in the order you want without imposing too many restrictions on them. The conditions in each ruleset should not depend on the previous states of other rulesets, i.e. they should not be coupled to each other.
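The recognize-act cycle can be illustrated with a toy forward-chaining loop. This is a deliberately simplified sketch and not how CLIPS itself is implemented; the rule names and facts are hypothetical, and the optional `salience` key is only loosely analogous to rule salience in CLIPS.

```python
def run(rules, facts, max_cycles=100):
    """Repeat recognize-act cycles until no rule can add a new fact."""
    facts = set(facts)
    fired = []
    for _ in range(max_cycles):
        # Recognize: rules whose conditions all hold and that would add something new.
        agenda = [r for r in rules if r["when"] <= facts and not r["then"] <= facts]
        if not agenda:
            break
        # Act: fire the highest-priority rule, then re-match against the new facts.
        rule = max(agenda, key=lambda r: r.get("salience", 0))
        facts |= rule["then"]
        fired.append(rule["name"])
    return facts, fired

# Hypothetical rules: each one only looks at facts, never at other rules.
RULES = [
    {"name": "flag-heavy-user", "when": {"usage-over-quota"}, "then": {"heavy-user"}},
    {"name": "apply-surcharge", "when": {"heavy-user", "plan-basic"}, "then": {"surcharge"}},
    {"name": "notify", "when": {"surcharge"}, "then": {"email-sent"}},
]

facts, fired = run(RULES, {"usage-over-quota", "plan-basic"})
print(fired)
```

Note that the firing order falls out of the facts each rule asserts rather than from the order in which the rules are listed, which is the decoupling described above.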

Requires deep understanding of the knowledge domain.
To model the rules, you will need to organize your knowledge of how things operate in your problem domain into a series of rules/patterns. You can attempt this knowledge modeling process yourself, or engage a knowledge modeling expert to assist you.

Execution should be data-driven or pattern-driven.
If you find yourself writing lots of conditional statements over many variables, the execution process is most likely data-driven or pattern-driven. Any non-trivial knowledge domain usually involves more than a dozen variables. In an OO/procedural language, this means your source code will end up tangled, with multiple levels of nested if-else statements. The CLIPS language syntax lets you specify the conditions (i.e. rules) under which a group of tasks is executed in an organized manner.
However, one must be careful to plan and organize the data structure of the CLIPS facts and classes. Every datum in each CLIPS class and fact should be well-encapsulated, using the same data-encapsulation principles as in OO development methodology.
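As a rough Python analogy (not CLIPS syntax), the sketch below keeps each condition as a separate entry in a rule table and holds the data in one encapsulated, immutable record instead of loose variables. The `Usage` fields, the rule names and the charges are made-up values for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Usage:
    """One encapsulated fact about a customer (hypothetical fields)."""
    plan: str
    minutes: int
    roaming: bool

# Each rule pairs a name and a condition with the charge it contributes.
RULES = [
    ("base-fee", lambda u: True, 10.0),
    ("overage", lambda u: u.plan == "basic" and u.minutes > 100, 5.0),
    ("roaming", lambda u: u.roaming, 8.0),
]

def bill(usage):
    """Apply every rule whose condition matches; no nested if-else needed."""
    matched = [(name, charge) for name, cond, charge in RULES if cond(usage)]
    return sum(charge for _, charge in matched), [name for name, _ in matched]

total, applied = bill(Usage(plan="basic", minutes=150, roaming=False))
print(total, applied)
```

Adding a new billing condition here means appending one rule rather than threading another branch through existing control flow.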