chris's blog

DBSight helps not to reinvent the wheel

Submitted by chris on Thu, 2007-06-07 12:48.

Mike makes great use of Lucene: http://blogs.atlassian.com/rebelutionary/archives/2007/04/my_serverside_java_symposium_2007_presen.html

First of all, fantastic effort!

But I see from the presentation that he is having problems with large Lucene indexes. He wants to solve them with an Eden space strategy, plus a pretty complicated strategy for a clustered Lucene index.

DBSight already uses the Eden space strategy, and it clusters with a simple and robust Lucene index replication mechanism. This requires a dedicated indexing server, but it relieves all other nodes of the heavy indexing work.

Lucene + JRuby + Database, DBSight has Ruby scripting!

Submitted by chris on Wed, 2007-04-18 11:14.

In the latest beta, we have added scripting capability, so you can easily customize it to do whatever you want!

For example, you can make a copy of the index, FTP it to another machine, or send an email to anyone.

The sky is the limit! More details will be posted on the wiki.

Instant Drupal Database Search by DBSight

Submitted by chris on Mon, 2007-02-26 11:14.

Hi, I have spent some time learning Drupal's database schema. It's pretty simple. I can create a DBSight search with just these two SQL queries.

Main Query

 SELECT nid, type, title, changed, promote, body FROM node
 order by changed desc

Subsequent Query

 SELECT name, CONCAT( subject, comment) as comment  FROM comments where nid = ?
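
How the two queries fit together can be sketched in plain Java. This is an assumption about the general pattern, not DBSight's actual code: each row of the main query becomes one search document, and the subsequent query runs once per row with `?` bound to that row's `nid`, its results being appended to the document. The class `NodeDocumentBuilder`, the in-memory maps, and the sample rows are hypothetical stand-ins for a real JDBC connection.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: merge main-query rows with per-row subsequent-query rows. */
final class NodeDocumentBuilder {

    /** Stand-in for the subsequent query: comments WHERE nid = ? */
    static List<String> commentsFor(int nid, Map<Integer, List<String>> commentsTable) {
        return commentsTable.getOrDefault(nid, List.of());
    }

    /** Builds one searchable text per main-query row, keyed by nid. */
    static Map<Integer, String> buildDocuments(Map<Integer, String> nodeRows,
                                               Map<Integer, List<String>> commentsTable) {
        Map<Integer, String> documents = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> node : nodeRows.entrySet()) {
            StringBuilder text = new StringBuilder(node.getValue());
            // The '?' placeholder is bound to the current row's nid (assumed behavior).
            for (String comment : commentsFor(node.getKey(), commentsTable)) {
                text.append(' ').append(comment);
            }
            documents.put(node.getKey(), text.toString());
        }
        return documents;
    }

    public static void main(String[] args) {
        Map<Integer, String> nodes = new LinkedHashMap<>();
        nodes.put(1, "Hello Drupal");
        nodes.put(2, "Second post");
        Map<Integer, List<String>> comments = Map.of(1, List.of("Nice post", "Thanks"));
        System.out.println(buildDocuments(nodes, comments));
    }
}
```

A node without comments still produces a document, which matches the idea that the main query decides what gets indexed and the subsequent query only enriches it.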

The full information and downloadable index configuration can be found here:

 http://wiki.dbsight.com/index.php?title=Application_search

To have your own drupal search, you just need to download the index configuration, and change the database connect

Configure Velocity to use Log4j in DBSight

Submitted by chris on Mon, 2006-11-06 12:41.

Velocity's verbose log errors are very annoying. I was tired of them too, but just let them be. Several people have complained about them recently, so I took 10 minutes to look through the web and found the trick.

It's super simple: just edit the velocity.properties file and add one line:

 runtime.log.logsystem.class=org.apache.velocity.runtime.log.Log4JLogSystem

Right. That's it. Finally Velocity is quiet now.

DBSight Index Configuration Upload/Download

Submitted by chris on Wed, 2006-09-13 20:45.

Now you can easily port existing index configurations to another DBSight instance. Just choose download on the dashboard, and the index's definition and related templates will be zipped for downloading. After uploading, the index is available right away, with no restart needed.

This will be greatly helpful for deploying an index created in a development environment to a production environment.

It's also very helpful for multi-server mode. You can dedicate one DBSight instance as the server for the indexing process, and several other instances as clients to process search requests. Visit http://wiki.dbsight.com/index.php?title=Remote_Index_Replication for more details.

MySQL is giving java.lang.OutOfMemoryError on large ResultSet

Submitted by chris on Sun, 2006-04-30 18:42.

MySQL always gives java.lang.OutOfMemoryError on a large ResultSet.

After some debugging on the 5.0.0 beta, I found that the function fetchRowsViaCursor in MysqlIO.java has a parameter fetchSize:

 protected List fetchRowsViaCursor(
  List fetchedRows, long statementId,
  Field[] columnTypes, int fetchSize)

But fetchSize is not actually used in the loop that reads the rows:

  while ((row = nextRow(columnTypes, columnTypes.length, true,
    ResultSet.CONCUR_READ_ONLY)) != null) {
    fetchedRows.add(row);
  }

So I changed it to this, recompiled, and it works!

  for(int i=0; i< fetchSize&&(
    (row = nextRow(columnTypes, columnTypes.length, true,
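
The idea behind the change, stopping each batch after fetchSize rows instead of draining the whole result, can be sketched outside the driver. This is an illustration only, not Connector/J code: `BoundedFetch`, `fetchBatch`, and the `Iterator` standing in for `nextRow` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of reading at most fetchSize rows per call. */
final class BoundedFetch {

    /** Reads up to fetchSize rows; an empty list means the source is exhausted. */
    static <T> List<T> fetchBatch(Iterator<T> rows, int fetchSize) {
        List<T> fetchedRows = new ArrayList<>();
        // Stop on whichever comes first: the batch is full or no rows remain.
        for (int i = 0; i < fetchSize && rows.hasNext(); i++) {
            fetchedRows.add(rows.next());
        }
        return fetchedRows;
    }

    public static void main(String[] args) {
        Iterator<Integer> rows = List.of(1, 2, 3, 4, 5).iterator();
        List<Integer> batch;
        while (!(batch = fetchBatch(rows, 2)).isEmpty()) {
            System.out.println(batch); // each batch holds at most 2 rows
        }
    }
}
```

Because only one batch is held at a time, memory use depends on fetchSize rather than on the total number of rows in the result.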

Lucene Date Range Search

Submitted by chris on Thu, 2006-04-13 11:17.

In DBSight 1.2.7, there will be range search available:

http://wiki.dbsight.com/index.php?title=Range_Search

So if you want something like this:

 select *
 from projects
 where content like '%keyword%'
 and created_at between to_date('2005/01/01')
    and to_date('2005/06/01')

For a date range query, if "created_at" is a "Keyword", "Date", "DateTime", or "Time" type, search with a query like this:

 keyword created_at:[2005/01/01,2005/06/01]
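
One common way to make such a range work on a Lucene keyword field is to index dates as fixed-width strings such as `20050101`, so that alphabetical order matches chronological order. The sketch below assumes that approach and an inclusive range; it is not DBSight's query parser, and `DateRangeSketch`, `toSortable`, and `inRange` are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch: date range check via sortable date strings. */
final class DateRangeSketch {
    // Format used in the query text, e.g. 2005/01/01.
    private static final DateTimeFormatter QUERY_FORMAT = DateTimeFormatter.ofPattern("uuuu/MM/dd");
    // Fixed-width format whose string order equals date order, e.g. 20050101.
    private static final DateTimeFormatter INDEX_FORMAT = DateTimeFormatter.ofPattern("uuuuMMdd");

    /** Converts a query date such as 2005/01/01 into its sortable form. */
    static String toSortable(String queryDate) {
        return LocalDate.parse(queryDate, QUERY_FORMAT).format(INDEX_FORMAT);
    }

    /** True if the indexed value lies within [from, to], both ends included (assumed). */
    static boolean inRange(String indexedDate, String from, String to) {
        return indexedDate.compareTo(toSortable(from)) >= 0
            && indexedDate.compareTo(toSortable(to)) <= 0;
    }

    public static void main(String[] args) {
        // Corresponds to created_at:[2005/01/01,2005/06/01]
        System.out.println(inRange("20050315", "2005/01/01", "2005/06/01"));
        System.out.println(inRange("20050602", "2005/01/01", "2005/06/01"));
    }
}
```

The comparison is purely on strings, which is why the fixed width and the year-month-day order matter.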

Lucene api changes for numeric range search

Submitted by chris on Mon, 2006-04-10 10:12.

From Ken,

 Chris, the release notes from DBSight 1.2.6 say that it includes Lucene
 2.0RC1.  That hasn't even been tagged, as far as I can see.  You
 certainly know more about the quality of the Lucene source tree than I
 do, but it really concerns me putting development snapshots into
 production here, especially with the problems I've had with regressions
 in DBSight.  Perhaps you need a public beta process or "unstable"
 releases to introduce new functionality.

Thanks for letting us know your concern! It does make sense that our overly aggressive approach may cause problems in production environments.

SourceForge's in-house search project

Submitted by chris on Wed, 2005-06-29 13:04.

SF.net has been doing the same thing DBSight does, except that DBSight is an off-the-shelf product.

Thanks to Chris Conrad! He has written an excellent article on the details of SF.net's search project. (I guess it was me who requested this on the Lucene mailing list.) At least we now know what a big search project usually looks like.

As you may know, DBSight is basically Lucene-on-database. Best of all, I've found DBSight already meets most of their requirements. And it's off-the-shelf!

Search Engine DIY

Submitted by chris on Tue, 2005-06-21 17:55.

IT managers may have one or several databases lying around. They want to search the database, but the annoying search always returns "No Results Found". The "Advanced Search", if there is one, is often complicated and hard to use, and it is often slow and resource-consuming.

Why not create a search as simple and elegant as Google? Well, your reply may be, "I don't think I can do that".

Actually you can!

With DBSight, you surely can create a search engine on your database yourself!

DBSight is a free-to-download J2EE application. It

  1. Has a scheduler to crawl database updates by JDBC