Generate ID from UUID

This is a method to generate a long id in the positive space.

There are a few issues to consider with this method:
– UUID is 16 bytes / 128 bits
– Long is 8 bytes / 64 bits

This means that we will lose some information. If we do not want to lose it we could use a BigInteger, but in this case we are dealing with longs.

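    /**
     * Generate unique ID from UUID in positive space
     * @return long value representing UUID
     */
    private Long generateUniqueId() {
        long val = -1;
        do {
            final UUID uid = UUID.randomUUID();
            final ByteBuffer buffer = ByteBuffer.wrap(new byte[16]);
            // longValue() keeps only the low 64 bits, i.e. the last 8 bytes written
            buffer.putLong(uid.getLeastSignificantBits());
            buffer.putLong(uid.getMostSignificantBits());
            final BigInteger bi = new BigInteger(buffer.array());
            val = bi.longValue();
        } while (val < 0);
        return val;
    }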

This works simply by creating a new BigInteger from the two halves of the UUID object and then taking its longValue. We also make sure that the ID is in positive space; if it is not, we simply repeat the process. During testing most cases completed in one iteration, but a few runs needed as many as four iterations.
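If the retry loop is undesirable, one alternative (not the method above, just a sketch) is to clear the sign bit of one 64-bit half of the UUID, which gives a non-negative long in a single step. The class and method names below (UuidIds, toPositiveLong, toBigInteger) are made up for illustration. toBigInteger shows how all 128 bits could be kept when a long is not required.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidIds {

    // Keeps 63 bits of the most significant half; clearing the sign bit
    // guarantees a value >= 0 without retrying.
    static long toPositiveLong(UUID uid) {
        return uid.getMostSignificantBits() & Long.MAX_VALUE;
    }

    // Keeps all 128 bits; signum 1 makes BigInteger treat the bytes
    // as an unsigned magnitude, so the result is never negative.
    static BigInteger toBigInteger(UUID uid) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(uid.getMostSignificantBits());
        buffer.putLong(uid.getLeastSignificantBits());
        return new BigInteger(1, buffer.array());
    }

    public static void main(String[] args) {
        UUID uid = UUID.randomUUID();
        System.out.println("uuid       = " + uid);
        System.out.println("long id    = " + toPositiveLong(uid));
        System.out.println("128-bit id = " + toBigInteger(uid));
    }
}
```

Either way, a long keeps at most half of the UUID, so these IDs are more likely to collide than the full 128-bit value.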

Configuring Java JDK on Ubuntu

This is an easy way to configure Java on a Linux box; all of this information is available online.

First we need to obtain the build.

sudo wget

Extract the archive (run this from /opt so that the path used in the next step exists)

sudo tar xzvf jdk-7u75-linux-x64.gz

Create a symbolic link so we can later update the version by repointing the link

sudo ln -s /opt/jdk1.7.0_75/ /opt/java

Next we edit /etc/profile and add the following two lines

export JAVA_HOME=/opt/java
export PATH=$JAVA_HOME/bin:$PATH

Finally we ‘source’ the file

source /etc/profile

At this point you should be ready to go. We can verify this by executing

java -version

Starting Jetty via command line and nohup

For some reason I am having problems starting Jetty via

service jetty start

Instead we will use a Unix command called nohup:
“Nohup is a unix command, used to start another program, in such a way that it does not terminate when the parent process is terminated.”

I opted to use this

nohup java -jar start.jar -Djetty.port=8085

While this works, it shows a message

nohup: ignoring input and appending output to `nohup.out'

To get rid of it we need to redirect input and output to /dev/null and run the process in the background

nohup java -jar start.jar -Djetty.port=8085 < /dev/null > /dev/null 2>&1 &

Taking a heap dump of a Java process on Linux and Windows

This shows how to take a heap dump from the console when Java VisualVM and JMX are not available to us.
We will use the following tools

    • jmap
    • jps
    • ps

Dumping the heap requires two steps:
1) Obtaining target process id
2) Dumping heap for given pid

First we need to obtain the id of the process we would like to dump. Here are a couple of ways I like to use.

ps aux | grep 'java'
userx     29901  6.7 47.0 25418812 3848276 ?    Sl   Mar23  85:42 /opt/java/bin/java -Djava.util.logging.config.

Here the second column indicates our process id (pid).

A second method that is quite useful for obtaining the pid of Java processes is jps

userx@WS4:/opt/java/bin# ./jps -l
29901 org.apache.catalina.startup.Bootstrap

As we can see, both methods returned a pid of 29901.
Now to perform the dump we issue our second command

userx@WS4:/opt/java/bin# ./jmap -dump:format=b,file=/tmp/heapdump-001.hprof 29901
Dumping heap to /tmp/heapdump-001.hprof ...

At this point we have a heap dump that is ready to be analyzed. For my analysis I use two tools: Eclipse Memory Analyzer (MAT) and Java VisualVM.

Accessing pixel data of a Leptonica PIX

This is mainly for reference.

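/**
 * Get pixel value at given point (assumes an 8 bpp image, one byte per pixel)
 */
l_uint32 pixAtGet(PIX* pix, int_t x, int_t y)
{
    l_int32 wpl    = pixGetWpl(pix);    // 32-bit words per line
    l_uint32* data = pixGetData(pix);   // start of the image data
    l_uint32* line = data + y * wpl;    // start of row y
    l_uint32 value = GET_DATA_BYTE(line, x);
    return value;
}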

To set a pixel value we can use this

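/**
 * Set pixel value at given point (assumes an 8 bpp image, one byte per pixel)
 */
void pixAtSet(PIX* pix, int_t x, int_t y, byte_t value)
{
    l_int32 wpl     = pixGetWpl(pix);   // 32-bit words per line
    l_uint32* data  = pixGetData(pix);  // start of the image data
    l_uint32* line  = data + y * wpl;   // start of row y
    SET_DATA_BYTE(line, x, value);
}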

Tokenizing/splitting a string in C++

This method uses strtok to tokenize our string on a given delimiter; the resulting tokens are put into the supplied vector. There are a few other ways to do this, but this one is straightforward.

#include <iostream>
#include <string>
#include <string.h>
#include <stdlib.h>
#include <vector>
using namespace std;

void split(vector<string>& out, const string& in, const string& delim)
{
  // strtok modifies its input, so work on a writable copy (+1 for the '\0')
  char* lc = (char*) malloc(in.size() + 1);
  strcpy(lc, in.c_str());
  char* tok = strtok(lc, delim.c_str());
  while (tok)
  {
      string s = tok;
      out.push_back(s);
      tok = strtok(NULL, delim.c_str());
  }
  free(lc);
}

int main(int argc, char* args[])
{
  string str = "apple,organge,cherry";
  vector<string> o1;
  split(o1, str, ",");
  for (size_t i = 0; i < o1.size(); ++i)
     cout << "token = " << o1[i] << endl;
  return 0;
}


Supplied string : apple,organge,cherry
Delimiter : “,”

  • apple
  • organge
  • cherry

Calculating partial Hausdorff Distance

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <limits>
#include <list>
#include <vector>

// int_t and char_t are not standard types; plain int and char are assumed here
typedef int  int_t;
typedef char char_t;

struct Point
{
    Point(int_t _x, int_t _y) : x(_x), y(_y) {}
    int_t x;
    int_t y;
};

typedef std::list<Point*> points_t;

double euclideanDistance(const Point& lhs, const Point& rhs)
{
    double p1 = std::pow((double)(rhs.x - lhs.x), 2);
    double p2 = std::pow((double)(rhs.y - lhs.y), 2);
    return std::sqrt(p1 + p2);
}

// Directed partial Hausdorff distance from seta to setb
double hausdorffPHD(const points_t& seta, const points_t& setb)
{
    std::vector<double> ranking;
    for (points_t::const_iterator afront = seta.begin(); afront != seta.end(); ++afront)
    {
        const Point* a = *afront;
        // distance from a to its nearest neighbour in setb
        double minDistance = std::numeric_limits<double>::max();
        for (points_t::const_iterator bfront = setb.begin(); bfront != setb.end(); ++bfront)
        {
            double ed = euclideanDistance(*a, **bfront);
            if (ed < minDistance)
                minDistance = ed;
        }
        ranking.push_back(minDistance);
    }
    if (ranking.empty())
        return 0; // nothing to rank
    std::sort(ranking.begin(), ranking.end());
    // pick the ranked distance at the given fraction instead of the maximum
    double fraction = .7;
    int k = (int) (seta.size() * fraction);
    return ranking[k];
}

// Undirected distance: the larger of the two directed distances
double hausdorff(const points_t& seta, const points_t& setb)
{
    double habPHD = hausdorffPHD(seta, setb);
    double hbaPHD = hausdorffPHD(setb, seta);
    double distancePHD = std::max(habPHD, hbaPHD);
    printf("hd = %0.4f\t %0.4f\t %0.4f\t \n", distancePHD, habPHD, hbaPHD);
    return distancePHD;
}

int main(int_t argc, char_t** args)
{
    points_t seta;
    points_t setb;
    seta.push_back(new Point(1, 2));
    seta.push_back(new Point(2, 4));
    setb.push_back(new Point(2, 4));
    setb.push_back(new Point(3, 4));
    double val = hausdorff(seta, setb);
    // release the heap-allocated points
    for (points_t::iterator it = seta.begin(); it != seta.end(); ++it) delete *it;
    for (points_t::iterator it = setb.begin(); it != setb.end(); ++it) delete *it;
    return 0;
}
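
For reference, the quantity computed above can be written in the usual rank-based form of the partial Hausdorff distance. The symbols A, B, f and K are introduced here for illustration and do not appear in the code. The fraction 0.7 plays the role of f, and because the code uses K as a zero-based index into the sorted distances, the exact rank it selects may be off by one from the textbook convention.

```latex
h_K(A, B) = K^{\text{th}}_{a \in A} \, \min_{b \in B} \lVert a - b \rVert,
\qquad K = \lfloor f \cdot |A| \rfloor

H(A, B) = \max\bigl( h_K(A, B),\; h_{K'}(B, A) \bigr),
\qquad K' = \lfloor f \cdot |B| \rfloor
```

Here the K-th operator denotes the K-th ranked value of the nearest-neighbour distances once they are sorted in ascending order. Taking the largest ranked value instead would give the ordinary directed Hausdorff distance.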