DBSight 2.0 released!

Submitted by chris on Fri, 2008-07-11 16:40.

This thread covers some interesting new features. First up: batching subsequent queries.

We are working on DBSight 2.0, which should improve DBSight on many fronts. One of the improvements is the option to batch subsequent queries.

Previously, you would have a subsequent query like:

 select * from comments where article_id = ?

It fetches all comments of an article. But this query runs once for each article, so DBSight has to send the same query to the database again and again.

Now, you can write the query like:

 select * from comments where article_id in ( ? )

DBSight will recognize this pattern, and expand the query to:

 select * from comments where article_id in ( ?,?,?,?,? )

when the batch size is 5.
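
To make the expansion concrete, here is a minimal sketch of the idea in Python. It is not DBSight's actual code; the helper name `expand_batch_query` and the regular expression are assumptions made up for illustration.

```python
import re

# Hypothetical pattern: a single "?" inside "in ( ... )".
# DBSight's real pattern matching is not shown in this post.
_BATCH_PATTERN = re.compile(r"in\s*\(\s*\?\s*\)", re.IGNORECASE)


def expand_batch_query(sql, batch_size):
    """Replace the first 'in ( ? )' with batch_size placeholders."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    placeholders = ",".join(["?"] * batch_size)
    return _BATCH_PATTERN.sub("in ( " + placeholders + " )", sql, count=1)


print(expand_batch_query("select * from comments where article_id in ( ? )", 5))
```

A query without the `in ( ? )` pattern comes back unchanged, so in this sketch old-style queries keep working as before.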

This should increase crawling performance a lot in some cases, especially when the database you are crawling is far away from DBSight.
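
The gain comes from fewer round trips to the database. A rough sketch of the grouping, assuming the article IDs are already known as a list (the `batches` helper is hypothetical, and how DBSight handles a short last batch is not described here):

```python
def batches(ids, batch_size):
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


article_ids = list(range(1, 13))  # 12 made-up article IDs

# One query per article versus one query per batch of 5.
unbatched_round_trips = len(article_ids)
batched_round_trips = len(list(batches(article_ids, 5)))
print(unbatched_round_trips, batched_round_trips)
```

With 12 articles and a batch size of 5, that is 12 queries before and 3 after. The saving matters most when each round trip is slow.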

How does it sound?

And do you have any other suggestions? Please let us know while the feature set is still open.

Submitted by Jeff on Sun, 2008-09-21 21:45.

I've just installed the beta and it seems to be quicker and more reliable when creating indexes.

I did have one problem when searching: when I requested the results ordered by price, I got no results.

By playing around I realised that in the past, when sending the search request to DBSight, I would use "sortBy" in the URL, but version 2 only works if it's all lower case, i.e. "sortby". It's still showing as "sortBy" under the Application Integrate tab.

Submitted by chris on Sun, 2008-09-21 22:23.

I just verified DBSight 2.x still uses "sortBy". Actually I checked all the source code and didn't find any place that's using "sortby".

I suspect something unexpected is going on, because if you send the ordering as "sortby", DBSight ignores it. The documents are still returned, but they are not sorted.
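
To illustrate what "ignored" means here, a small sketch of a case-sensitive parameter lookup. This is not DBSight's request handling; the function name and the `q` parameter in the sample query strings are assumptions for illustration only.

```python
from urllib.parse import parse_qs


def requested_sort(query_string):
    """Return the value of the case-sensitive 'sortBy' parameter, or None."""
    params = parse_qs(query_string)
    values = params.get("sortBy")  # "sortby" is a different key
    return values[0] if values else None


print(requested_sort("q=camera&sortBy=price"))
print(requested_sort("q=camera&sortby=price"))
```

The second call finds no "sortBy" key at all, so a search handled this way would still run, just without any ordering applied.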

Please verify the "price" column is set to "Keyword" type, and is configured to "Sortable".

Submitted by chris on Thu, 2008-09-11 11:16.

I have to say I am impressed with Google Protocol Buffers. It's actually pretty easy to use and, most of all, it transmits binary data very efficiently.

We added a Java API so that other Java programs can search DBSight directly. I had tried the XML approach, which uses a customizable template, but the performance was so-so, since it needs to transmit a lot of data. This is especially true when an empty query matches all results, because the data for narrowBy (facet search) is huge.

When we switched from XML to Google Protocol Buffers, we did no specific benchmarking, but it just feels much faster.

And this opens the door for other languages to talk with DBSight!

Thanks Google for sharing this!

Submitted by chris on Sun, 2008-09-07 15:38.

Just added another very useful feature, based on an idea from James at Costco.

Basically, it keeps the old values for database connection info and scheduling when you upload an index configuration to overwrite an existing one.

This is great if you have production, staging, testing, and development instances lying around, and need to move index configurations through all the steps. It is also great if you have a cluster of searching and indexing nodes and need to synchronize a small change, like adding a column or changing templates.
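
The idea can be sketched as a merge where the uploaded configuration wins everywhere except for a few preserved sections. The dictionary layout and the key names `connection` and `schedule` are assumptions for illustration; DBSight's real configuration format is not shown in this post.

```python
import copy

# Hypothetical section names that stay local to each instance.
PRESERVED_KEYS = ("connection", "schedule")


def overwrite_config(existing, uploaded, preserved=PRESERVED_KEYS):
    """Return the uploaded config, keeping preserved sections from existing."""
    merged = copy.deepcopy(uploaded)
    for key in preserved:
        if key in existing:
            merged[key] = copy.deepcopy(existing[key])
    return merged


production = {
    "connection": {"url": "jdbc:example://prod-db"},  # made-up value
    "schedule": "nightly",
    "columns": ["title", "body"],
}
from_staging = {
    "connection": {"url": "jdbc:example://staging-db"},  # made-up value
    "schedule": "hourly",
    "columns": ["title", "body", "price"],
}

result = overwrite_config(production, from_staging)
print(result["connection"]["url"], result["schedule"], result["columns"])
```

Here the new `price` column arrives from staging, while production keeps its own connection and schedule.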

Submitted by chris on Tue, 2008-09-02 20:30.

Actually, this one is not really a "feature": DBSight now remembers each column's settings when you update the SQL. Thanks to Ken from UC Berkeley for this great suggestion!

I have suffered from this problem too. Previously, if you updated the SQL, the columns' settings went back to their defaults. This is trivial if you don't do it often, but very annoying if you need to adjust the SQL now and then and have forgotten the column configuration you did a long time ago.

Now it is much better. All column settings are remembered. Available right now in DBSight 2.0 beta!
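
Roughly, the behaviour is a carry-over by column name: columns that still exist after the SQL change keep their settings, and only new columns get defaults. A minimal sketch, where the setting names and the default values are assumptions for illustration rather than DBSight's actual ones:

```python
# Hypothetical defaults for a column that has never been configured.
DEFAULT_SETTINGS = {"type": "Text", "sortable": False}


def carry_over_column_settings(old_settings, new_columns):
    """Keep settings for columns that survive the SQL change; default the rest."""
    return {
        name: dict(old_settings.get(name, DEFAULT_SETTINGS))
        for name in new_columns
    }


old = {
    "price": {"type": "Keyword", "sortable": True},
    "title": {"type": "Text", "sortable": False},
}
updated = carry_over_column_settings(old, ["title", "price", "brand"])
print(updated["price"], updated["brand"])
```

In this sketch `price` keeps its Keyword and sortable settings, while the newly added `brand` column starts from the defaults.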

Submitted by chris on Mon, 2008-08-18 23:39.

Finally, it's kind of ready after smoothing many rough edges.

I would expect some NullPointerExceptions or other strange errors. Just let me know and we will fix them quickly.

One important note! Previous index configurations are not compatible. Actually, they are partly compatible, but not fully, and not fully tested either. So it's better to start with a clean one.

The downloading links are

Submitted by chris on Thu, 2008-08-07 18:21.

We now use the name "scaffolding", since Ruby on Rails made it very popular.

DBSight 1.x has a scaffolding system that works well, but the generated code is not clean enough. It was based on primitive regular expression substitution.

With DBSight 2.0, scaffolding will be much cleaner to use. Even better, users like you can easily create your own scaffolding and re-create templates based on it!

This is thanks to the Freemarker language, which is very similar to Velocity, yet more powerful. One of its important features is "escape", which lets you escape Freemarker syntax inside a Freemarker page. This paves the way for super flexible scaffolding.

So, learn some Freemarker. It's similar to Velocity and really easy, since you don't need to know the whole syntax. The basics are enough: "foreach", "if/then/else", and string formatting.

One more thing about result templates: you can now specify a file to render. Previously you could only use the main.vm of a template. Now you can issue a search and specify not only the template, but also the file, like


This way, you can have one bundled template that can render several things. It'll be great for AJAX stuff.

Or you can even use a JSP to render results. For example, suppose you have a pdf.jsp to render results.


The javadoc for SearchResult is also provided, so you should be free to go anywhere with it.