[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Omaha.pm] Embeddable database options.

To: "Perl Mongers of Omaha, Nebraska USA" <omaha-pm@pm.org>
Subject: Re: [Omaha.pm] Embeddable database options.
From: Todd Christopher Hamilton <netarttodd@gmail.com>
Date: Wed, 26 Aug 2009 19:29:35 -0500
Delivered-to: mailman-omaha-pm@mailman.pm.dev
Delivered-to: omaha-pm@pm.org
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=P8tvf0cCRy1a6NxsfyvS1KLpZKvBiVcieF2lOnfNysc=; b=VAf1UK7oakfHPMcmXlp/DkGc3i1FvX3DDpPVnbpEplwkuNRQ/OZFOQ6yKQOobCM8Gb bMJ9LOtUCqswU91ffB6K0uLNxAxV8EbsIjfneuN3580sj1y2IooDgRW53FT9lDvfvXS2 WaYhta4EkjulH10QvbBB6tGpopTiihu0Zq1gI=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=Fir4z0wX9J9mZ98Tn5J5HoAEdHXliOtAID69Nv36Bw/cHClRPbwJg9b9jjmQfAgOhd 4ZAvkbKkqKbJsNtaMrVROd6ALBl/FSXEO4ti1LOnhfRtmLpo04WuEJYZ36ab+BwrTJBX wnN0jwj0vk4pyh9KxxuUWAKAuWiu5qFESOwNE=
In-reply-to: <a4439dea0908241411m17b75744j1a7da148254a22e@mail.gmail.com>
List-archive: <http://mail.pm.org/pipermail/omaha-pm>
List-help: <mailto:omaha-pm-request@pm.org?subject=help>
List-id: "Perl Mongers of Omaha, Nebraska USA" <omaha-pm.pm.org>
List-post: <mailto:omaha-pm@pm.org>
List-subscribe: <http://mail.pm.org/mailman/listinfo/omaha-pm>, <mailto:omaha-pm-request@pm.org?subject=subscribe>
List-unsubscribe: <http://mail.pm.org/mailman/options/omaha-pm>, <mailto:omaha-pm-request@pm.org?subject=unsubscribe>
References: <3e2be50908240717g4b5d6a07s2d8adbf3e71daf5b@mail.gmail.com> <a4439dea0908241411m17b75744j1a7da148254a22e@mail.gmail.com>
Reply-to: "Perl Mongers of Omaha, Nebraska USA" <omaha-pm@pm.org>

I use Sqlite for embeded but you could also look at FireBird. I know
of a cool healthcare project used by the med center that uses FireBird
as a local cache for a fat client.


2009/8/24 Mario Steele <mario@ruby-im.net>:
> Hello Dan,
>
> On Mon, Aug 24, 2009 at 10:17 AM, Dan Linder <dan@linder.org> wrote:
>>
>> Guys,
>>
>> I'm looking at rewriting some of the store/retrieve code in a project
>> I'm working on.  The current method uses the Data::Dumper and eval()
>> code to store data to a hierarchical directory structure on disk.
>> Over the weekend I all but eliminated the hard-disk overhead by moving
>> the data to a temporary RAM disk -- sadly, the speed-ups were too
>> small to notice.  This tells me that the overall Linux file-system
>> caching is working quite well.  (Yay!) Unfortunately, this leads me
>> (again) determine that the Dumper/eval() code is probably the
>> bottle-neck.  (Definately not what they were designed for, but work
>> remarkably well none the less...)
>
> Eval is more then likely your biggest bottleneck.  Dumper not so much, but
> heavy usage of eval in any language, can create a bottleneck in nothing
> flat.
>
>>
>> So, I started investigating alternatives:
>>  * A true database with client/server model (i.e. MySQL, PostgreSQL, etc)
>
> Use MySQL / PostgreSQL is you are going to have many hits to the Perl script
> that is going to be executing.  It does well with threading, and also solves
> the problem mentioned below about SQLite.
>
>>  * An embedded database such as SQLite (others?)
>
> SQLite is a great Database system, for File Based data storage.
> Unfortunately, it stores in binary, so you can't exactly use grep, vi, etc,
> etc, to read the contents of the database file.  But unlike it's big
> brother, you can only have one transactional lock (EG Database Open) at a
> time on a database file.  This is to prevent corruption of the data.  (And
> yes, this locks even if your just doing a read query.)
>
>>
>> * Continue using the filesystem+directory structure using
>> freeze()/thaw() from the FreezeThaw CPAN module (speed improvement?)
>
> I dunno if freeze()/thaw() will do any good, as it still comes down to
> Dumper/eval() to properly store the information.
>
>>
>> * Use a DBD module to store/retrieve these files (i.e. DBD::File,
>> DBD::CSV, etc) (benefit here is that a simple change in the DB setup
>> code will mean a change from DBD::File to DBD::SQLite or
>> DBD::PostgreSQL should be fairly short work)
>
> DBD overall, is a great front end for you to use, for database storage, as
> it gives you a common api across many different DB Backends.  If you want
> consistency, and the ability to test different database storage engines,
> then I would strongly recommend you use DBD.
>
>>
>> Internally I have some constraints:
>>  * We'd like to keep the number of non-core Perl modules down
>> (currently we're 90% core), and a couple customers are extremely
>> sensitive to anything that is not supplied by their OS provider
>> (Solaris and HPUX for example).
>
> This is true in many facets, but you'll find standard that MySQL and SQLite
> are often the biggest thing that is distributed on most Operating Systems
> (Aside from Windows, but we won't go there).
>
>> * We would also like to keep the files on disk and in a
>> human-readable form so the end users and support staff can peruse this
>> data with simple tools (grep, vi, etc).
>
> Again, as stated above, SQLite, and MySQL won't let you use grep, vi, etc,
> to view the data, but simple tools can be created to create the same effect,
> and highly optimize it to specific tasks, instead of looking through
> hundreds of lines of data, to find a specific field.
>
>>
>>  * The remaining 10% that is non-core Perl modules are local copies of
>> "pure perl" CPAN modules we've merged into the source code branch
>> directly.  (We do this because the code runs on Solaris/SPARC,
>> Solaris/x86_64, Linux/x86, Linux/ia64, HPUX/PA-RISC, HPUX/ia64, etc)
>>
>> My personal pick at the moment is SQLite (it is provided natively in
>> Solaris 10, and easy to install on Linux platforms), but I question if
>> the speed up it provides will be over-shadowed by the constant
>> spawning of the sqlite binary each time an element of data is queried.
>>  (Anyone know if there is a way to leave a persistent copy of SQLite
>> running in memory that future copies hook into?  Getting a bit far
>> afield from the initial SQLite implementation goals...)
>
> Now, I come to this, after explaining the above to you, and I will be
> directly to the point.  SQLite Binary (or BLOB) data types, while may seem
> to be huge for data allocation and stuff, is actually quite minimal in
> overall speed.  This especially can be optimized when you need to look at
> specific data fields, and could care less about the rest.  As well with
> anything else, SQLite does have overheads, but not nearly as much as you
> might think.  It only allocates the data needed to return the results of a
> SQL query, or insert data into the database.
>
> The SQLite team has put much effort into optimizing the SQLite engine, so
> that it can store, as well as retrieve data in the most efficient manner
> possible, and keep the engine fast, and properly working.  Many Linux
> distributions (Ubuntu among most), use SQLite for a large amount of storage
> within their own system, such as APT/Aptitude/Synaptic.  Using SQLite can
> have it's advantages, but also it's downfalls to.  If your wanting to avoid
> database locking issues, then I suggest MySQL.  If your looking for Light
> weight solution, that is quick, and not so much a worry about Locking
> issues, then I would suggest SQLite.
>
>>
>> Thanks for any insight,
>>
>> DanL
>>
>> --
>> ******************* ***************** ************* ***********
>> ******* ***** *** **
>> "Quis custodiet ipsos custodes?" (Who can watch the watchmen?) -- from
>> the Satires of Juvenal
>> "I do not fear computers, I fear the lack of them." -- Isaac Asimov
>> (Author)
>> ** *** ***** ******* *********** ************* *****************
>> *******************
>> _______________________________________________
>> Omaha-pm mailing list
>> Omaha-pm@pm.org
>> http://mail.pm.org/mailman/listinfo/omaha-pm
>
> Hope this helps, and it is just my own two cents on the deal.
>
> --
> Mario Steele
> http://www.trilake.net
> http://www.ruby-im.net
> http://rubyforge.org/projects/wxruby/
> http://rubyforge.org/projects/wxride/
>
> _______________________________________________
> Omaha-pm mailing list
> Omaha-pm@pm.org
> http://mail.pm.org/mailman/listinfo/omaha-pm
>



-- 
Todd Christopher Hamilton
(402) 660-2787

References:
- [Omaha.pm] Embeddable database options.
  - From: Dan Linder <dan@linder.org>
- Re: [Omaha.pm] Embeddable database options.
  - From: Mario Steele <mario@ruby-im.net>

Prev by Date: Re: [Omaha.pm] Embeddable database options.
Next by Date: [Omaha.pm] Fwd: Perl for Android
Previous by thread: Re: [Omaha.pm] Embeddable database options.
Next by thread: [Omaha.pm] Fwd: Perl for Android
Index(es):
- Date
- Thread