[Omaha.pm] Embeddable database options.



Guys,

I'm looking at rewriting some of the store/retrieve code in a project
I'm working on.  The current method uses the Data::Dumper and eval()
code to store data to a hierarchical directory structure on disk.
Over the weekend I all but eliminated the hard-disk overhead by moving
the data to a temporary RAM disk -- sadly, the speed-ups were too
small to notice.  This tells me that the overall Linux file-system
caching is working quite well.  (Yay!)  Unfortunately, this leads me
(again) to conclude that the Dumper/eval() code is probably the
bottleneck.  (Definitely not what they were designed for, but they work
remarkably well nonetheless...)
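
One rough way to check that suspicion is to time the round trip in
isolation.  The sketch below is only an illustration: the $record
structure and the sub names are made up, and Storable is an assumption
on my part, picked as a comparison point solely because it ships with
core Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Storable qw(freeze thaw);
use Benchmark qw(timethese);

# Hypothetical record; the real project's data layout is not shown here.
my $record = {
    host    => 'example-host',
    metrics => [ map { +{ id => $_, value => $_ * 1.5 } } 1 .. 200 ],
};

# Serialize to Perl source text with Data::Dumper, then rebuild it with
# a string eval().  Dumper's default output assigns to $VAR1.
sub dumper_round_trip {
    my ($data) = @_;
    my $text = Dumper($data);
    my $VAR1;
    eval $text;
    die "eval failed: $@" if $@;
    return $VAR1;
}

# Same round trip through Storable's in-memory freeze()/thaw().
sub storable_round_trip {
    my ($data) = @_;
    return thaw( freeze($data) );
}

# Run each round trip 200 times and print the timings.
timethese( 200, {
    dumper_eval => sub { dumper_round_trip($record) },
    storable    => sub { storable_round_trip($record) },
} );
```

Note that Storable writes a binary format, so even if it wins the
timing it would not keep the files readable with grep or vi.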

So, I started investigating alternatives:
 * A true database with client/server model (e.g. MySQL, PostgreSQL, etc)
 * An embedded database such as SQLite (others?)
 * Continue using the filesystem+directory structure using
freeze()/thaw() from the FreezeThaw CPAN module (speed improvement?)
 * Use a DBD module to store/retrieve these files (e.g. DBD::File,
DBD::CSV, etc).  The benefit here is that moving from DBD::File to
DBD::SQLite or DBD::Pg should be fairly short work: a simple change
in the DB setup code.
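
To illustrate the last option, here is a minimal sketch of keeping the
backend choice in one place.  It assumes the non-core DBI and
DBD::SQLite modules are installed; the "items" table and the
open_store/save_item/load_item names are hypothetical, not part of the
project.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the DSN string is the only backend-specific part.
sub open_store {
    my ($dsn, $user, $pass) = @_;
    require DBI;    # non-core; loaded lazily so this file compiles without it
    return DBI->connect( $dsn, $user, $pass,
        { RaiseError => 1, AutoCommit => 1 } );
}

# Store one name/value pair, replacing any earlier value.
sub save_item {
    my ($dbh, $name, $value) = @_;
    $dbh->do( 'DELETE FROM items WHERE name = ?', undef, $name );
    $dbh->do( 'INSERT INTO items (name, value) VALUES (?, ?)',
        undef, $name, $value );
    return;
}

# Fetch the value stored under a name, or undef if there is none.
sub load_item {
    my ($dbh, $name) = @_;
    my ($value) = $dbh->selectrow_array(
        'SELECT value FROM items WHERE name = ?', undef, $name );
    return $value;
}

# Demo: runs only when DBI and DBD::SQLite are actually available.
if ( eval { require DBI; require DBD::SQLite; 1 } ) {
    my $dbh = open_store( 'dbi:SQLite:dbname=:memory:', '', '' );
    $dbh->do('CREATE TABLE items (name TEXT PRIMARY KEY, value TEXT)');
    save_item( $dbh, 'disk0', 'online' );
    print load_item( $dbh, 'disk0' ), "\n";
    $dbh->disconnect;
}
```

With this shape, switching drivers would mostly mean passing a
different DSN to open_store(), as long as the SQL stays portable.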

Internally I have some constraints:
 * We'd like to keep the number of non-core Perl modules down
(currently we're 90% core), and a couple of customers are extremely
sensitive to anything that is not supplied by their OS provider
(Solaris and HPUX for example).
 * We would also like to keep the files on disk and in a
human-readable form so the end users and support staff can peruse this
data with simple tools (grep, vi, etc).
 * The remaining 10% consists of local copies of "pure perl" CPAN
modules that we've merged directly into the source code branch.
(We do this because the code runs on Solaris/SPARC,
Solaris/x86_64, Linux/x86, Linux/ia64, HPUX/PA-RISC, HPUX/ia64, etc)

My personal pick at the moment is SQLite (it is provided natively in
Solaris 10 and is easy to install on Linux platforms), but I wonder
whether the speed-up it provides will be overshadowed by the constant
spawning of the sqlite binary each time an element of data is queried.
(Does anyone know of a way to leave a persistent copy of SQLite
running in memory that future copies hook into?  That is getting a bit
far afield from SQLite's original implementation goals...)

Thanks for any insight,

DanL

-- 
******************* ***************** ************* ***********
******* ***** *** **
"Quis custodiet ipsos custodes?" (Who can watch the watchmen?) -- from
the Satires of Juvenal
"I do not fear computers, I fear the lack of them." -- Isaac Asimov (Author)
** *** ***** ******* *********** ************* *****************
*******************