PhantomSQL released

I have finally released my PhantomSQL project.

PhantomSQL is a domain-specific language designed for mining content from static and dynamic sources. It closely resembles SQL, with features borrowed from popular dynamic languages.
It can be run as an interpreter or in ‘server’ mode, and it comes with a Type 4 JDBC driver for ease of integration with Java applications.

Sample Queries

Here are just a few examples taken from the project site to show some of the language's syntax.

Hello World

The following example shows how to query google.com for blog search results.

 
 SELECT FIRST css("#ires a") AS title FROM https://www.google.com/SEARCH
  USING GET WITH {'q': "josh bloch", 'tbm':"blg"}
 
  print @title +" : "+ @title['href']

Crawling Flickr.com

This query searches flickr.com for ‘nabilishes’ and then crawls the site using GET for as long as the ‘css("a.Next")’ condition matches. At the end it prints how many results were found.

 @RESULT = SELECT css(".pc_img") FROM http://www.flickr.com/SEARCH 
 USING GET WITH {'q': "nabilishes", 'm' : "text"}
 crawl(css("a.Next")) 
 print @RESULT.length()
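
To make the crawl step a bit more concrete, here is a rough Java sketch of the general idea: collect the matches on a page, follow the "next" link if there is one, and stop when there isn't. This is not PhantomSQL's implementation. The `Page` class, the in-memory `site` map and the URLs are hypothetical stand-ins I made up for illustration, and the cycle check is my own assumption about what a sensible crawler would do.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CrawlSketch {

    /** Hypothetical stand-in for a fetched page: the matched items plus the "next" link (null if none). */
    static final class Page {
        final List<String> matches;
        final String nextUrl;

        Page(List<String> matches, String nextUrl) {
            this.matches = matches;
            this.nextUrl = nextUrl;
        }
    }

    /**
     * Collects matches from startUrl and from every page reached by following
     * the "next" link. Stops when the link is missing, points to an unknown
     * page, or points to a page that was already visited.
     */
    static List<String> crawl(Map<String, Page> site, String startUrl) {
        List<String> results = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        String url = startUrl;
        while (url != null && site.containsKey(url) && visited.add(url)) {
            Page page = site.get(url);
            results.addAll(page.matches);
            url = page.nextUrl;
        }
        return results;
    }

    public static void main(String[] args) {
        // Two fake result pages; the second one has no "next" link.
        Map<String, Page> site = Map.of(
            "/search?page=1", new Page(List.of("a.jpg", "b.jpg"), "/search?page=2"),
            "/search?page=2", new Page(List.of("c.jpg"), null));
        System.out.println(crawl(site, "/search?page=1").size()); // prints 3
    }
}
```

In the PhantomSQL query the same loop is expressed declaratively: the SELECT plays the role of `page.matches` and `crawl(css("a.Next"))` plays the role of following `nextUrl`.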

Extracting images from Flickr.com search results

Here we build on the previous example by adding the ‘WHILE-SELECT’ construct and saving each image with the ‘save’ function.

 while SELECT css(".pc_img", "src", TRUE) AS img FROM http://www.flickr.com/SEARCH
       USING GET WITH {'q': "nabilishes", 'm' : "text"}
       crawl(css("a.Next")) 
 BEGIN
   save (@img)
 END

Retrieving Binary Content

The following two examples are equivalent.

The following example retrieves binary content from a site and saves it to the filesystem when run via the interpreter.

 
 SELECT FIRST css("#main_image", "src", TRUE) AS item FROM https://1saleaday.com
 save(@item)

The following example retrieves the same binary content through the JDBC driver and writes it to a file.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.sql.Blob;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public void retrieveFile()
    {
        // Load the PhantomSQL driver class so it registers with DriverManager.
        try
        {
            Class.forName("com.gbltech.phantomsql.driver.PhantomDriver");
        }
        catch (ClassNotFoundException e)
        {
            e.printStackTrace();
        }

        String query = "select first css(\"#main_image\", \"src\", true) as item from https://1saleaday.com";

        // try-with-resources closes the result set, statement and connection, even on failure.
        try (Connection conn = DriverManager.getConnection("jdbc:phantomsql://localhost?characterEncoding=utf8");
             Statement statement = conn.createStatement();
             ResultSet resultSet = statement.executeQuery(query))
        {
            if (resultSet.next())
            {
                // The first column holds the binary content of the selected image.
                Blob blob = resultSet.getBlob(1);
                try (InputStream is = blob.getBinaryStream();
                     OutputStream out = new FileOutputStream("./test.jpg"))
                {
                    // Copy the blob to disk in 1 KB chunks.
                    byte[] buff = new byte[1024];
                    int read;
                    while ((read = is.read(buff)) != -1)
                    {
                        out.write(buff, 0, read);
                    }
                }
            }
        }
        catch (SQLException | IOException e)
        {
            e.printStackTrace();
        }
    }

This is just a taste of what PhantomSQL can do; there is much more, so go check it out.
I am looking for feedback, criticism, and bug reports to help me make the project better, so if you have something to share, drop me a line.
