[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Omaha.pm] [Fwd: Re: Gedcom.pm - searching for a path from person A to person X?]
Paul's reply...
j
-------- Original Message --------
Subject: Re: Gedcom.pm - searching for a path from person A to person X?
Date: Mon, 23 Jan 2006 17:43:19 +0100
From: Paul Johnson <paul@pjcj.net>
To: Jay Hannah <jay@jays.net>
CC: Omaha Perl Mongers <omaha-pm@pm.org>
References: <e4f68fcdb67f7986800bd8dddb266d79@jays.net>
On Sat, Nov 26, 2005 at 12:52:27PM -0600, Jay Hannah wrote:
Hi Paul --
Hello Jay (and omaha.pm),
First, my apologies for a delayed reply.
I've got a little over 4000 people in my GEDCOM
http://jays.net/genealogy/
That's pretty impressive. I wish I had that much information, but I
spend too much time hacking Gedcom.pm and too little time actually
researching.
I'm also a Perl hacker, and was playing with your Gedcom.pm and
wondering if you have any logic to find paths from arbitrary points in
GEDCOMs to other arbitrary points.
Something like
The path from INDI 0012 to INDI 1077:
Start: INDI 0012
Father: INDI 0078
Mother: INDI 1032
Married: INDI 1999
Daughter: INDI 1066
Married: INDI 0887
Son: INDI 1077
Once you found the path, of course, you could make it as pretty as you
wanted to, Names, dates, etc.
Does such a thing exist?
Im not sure whether anyone has written such a thing directly in Perl,
but I do know of a lifelines script called "relation" that does (mostly)
what you are after. This script can be translated into Perl using
lines2perl which is a part of the Gedcom.pm release.
I've reproduced the script below. (It required a small change to
lines2perl.) You can run this directly, or use it to learn from. The
original script is after __END__. The Perl is not wonderful, but I
don't think it's too bad for an automatic translation.
As an aside, lifelines has some really nice reports. For the most part,
lines2perl will translate them into Perl for you. (There might be
problems with some of the newer ones if they use new features I haven't
implemented yet.) See http://lifelines.sourceforge.net/ and
ftp://ftp.cac.psu.edu/pub/genealogy/lines/reports/
Let me know if you run into any problems.
If not, do you have any thoughts about how I
might right it? Any interest in this being a plugin to Gedcom.pm?
Gedcom::Search or something?
If you'd like to write some general method for this I'd be very happy to
include it, either within the distribution or by providing some sort of
plugin support. If you do this, you might like to subscribe to the
mailing list (details in the README) to keep us informed.
Thanks!
#!/usr/local/bin/perl -w
# This program was generated by lines2perl, which is part of Gedcom.pm.
# Gedcom.pm is Copyright 1999-2005, Paul Johnson (pjcj@cpan.org)
# Version 1.1502 - 20th December 2005
# Gedcom.pm is free. It is licensed under the same terms as Perl itself.
# The latest version of Gedcom.pm should be available from my homepage:
# http://www.pjcj.net
use strict;
require 5.005;
use diagnostics;
use integer;
use Getopt::Long;
use Gedcom::LifeLines 1.1502;
my $Ged; # Gedcom object
my %Opts; # options
my $_Traverse_sub; # subroutine for traverse
sub out { print STDERR @_ unless $Opts{quiet} }
sub outf { printf STDERR @_ unless $Opts{quiet} }
sub initialise ()
{
die "usage: $0 -gedcom_file file.ged\n"
unless GetOptions(\%Opts,
"gedcom_file=s",
"quiet!",
"validate!",
) and defined $Opts{gedcom_file};
local $SIG{__WARN__} = sub { out "\n@_" };
out "reading...";
$Ged = Gedcom->new
(
gedcom_file => $Opts{gedcom_file},
callback => sub { out "." }
);
if ($Opts{validate})
{
out "\nvalidating...";
my %x;
my $vcb = sub
{
my ($r) = @_;
my $t = $r->{xref};
out "." if $t && !$x{$t}++;
};
$Ged->validate($vcb);
}
out "\n";
set_ged($Ged);
}
$SIG{__WARN__} = sub
{
out $_[0] unless $_[0] =~ /^Use of uninitialized value/
};
# /*
# relation - a LifeLines relation computing program
# by Jim Eggert (eggertj@atc.ll.mit.edu)
# Version 1, 21 November 1992
# Version 2, 23 November 1992 (completely revamped)
# Version 3, (changed format slightly, modified code somewhat)
# Version 4, 6 July 1993 (added English language)
# Version 5, 6 September 1993 (generified language)
#
# This program calculates the relationship between individuals in a
# database. It does so in three modes. Mode 1 just does one pair of
# individuals and then exits. Mode 2 does any number of pairs with a
# common "from" person. Mode 3 does all pairs with a common "from"
# person. In general, mode 1 is fastest for simple relationships, but
# if you want one complicated relationship, you may as well do them all.
#
# Each computed relation is composed of the minimal combination of
# parent (fm), sibling (bsS), child (zdC), and spouse (hw) giving the
# relational path from the "from" person to the "to" person. Each
# incremental relationship (or hop) is coded as follows, with the
# capital letters denoting a person of unknown gender:
# father f
# mother m
# parent P (not used)
# brother b
# sister s
# sibling S
# son z (sorry)
# daughtr d
# child C
# husband h
# wife w
# spouse O (sorry again, but usually not possible)
#
# The report gives the steps required to go from the first person to
# the second person. Thus the printout
# I93 John JONES fmshwz I95 Fred SMITH
# means that John Jones' father's mother's sister's husband's wife's son
# is Fred Smith. Notice in this case, the sister's husband's wife is
# not the same as the sister, and the husband's wife's son is not the
# same as the husband's son. Thus in more understandable English, John
# Jones' paternal grandmother's sister's husband's wife's son from
# another marriage is Fred Smith.
#
# The program will do a trivial parsing of the path string. You can
# change the language_table to have it print in different languages, as
# long as the word order is unchanged.
#
# If there is no relation, the program says so. That at least should be
# easy to explain. Mode 3 only prints out those individuals who are
# related to the "from" individual.
# */
my $plist;
my $hlist;
my $mark;
my $keys;
my $found;
my $do_names;
my $language;
my $language_table;
my $token;
my $untoken;
sub include ($$$$$)
{
my($person, $hops, $keypath, $path, $pathend) = @_;
my $and;
my $entry;
my $eq;
my $pkey;
if (($person && ($found == 0)))
{
$pkey = &key($person);
if ($entry = $mark->{$pkey})
{
if ((&strcmp($entry, "is not related to") == 0))
{
$found = 1;
$plist = [];
$hlist = [];
$mark->{&save($pkey)} = &save(&concat($path, $pathend));
$keys->{&save($pkey)} = &save(&concat(&concat($keypath, "@"), $pkey));
}
}
else
{
push @$plist, &save($pkey);
push @$hlist, $hops;
$mark->{&save($pkey)} = &save(&concat($path, $pathend));
$keys->{&save($pkey)} = &save(&concat(&concat($keypath, "@"), $pkey));
}
}
undef
}
sub get_token ($)
{
my($input) = @_;
my $eq;
my $first_delim;
my $not;
# /* Parse a token from the input string.
# Tokens are separated by one or more "@"s.
# Set global parameter token to the first token string.
# Set global parameter untoken to the rest of the string after first token.
# */
# /* strip leading @s */
#
$untoken = &save($input);
$first_delim = &index($untoken, "@", 1);
LOOP: while (($first_delim == 1))
{
$untoken = &save(&substring($untoken, 2, &strlen($untoken)));
$first_delim = &index($untoken, "@", 1);
}
# /* get token and untoken */
#
if ((! $first_delim))
{
$token = &save($untoken);
$untoken = &save("");
}
else
{
$token = &save(&substring($untoken, 1, ($first_delim - 1)));
$untoken = &save(&substring($untoken, ($first_delim + 1), &strlen($untoken)));
}
undef
}
sub parse_relation ($$)
{
my($relation, $keypath) = @_;
my $charcounter;
my $le;
my $not;
my $strlen;
if ((! $language))
{
display " ";
display $relation;
if ($do_names)
{
$untoken = $keypath;
display &get_token($untoken);
LOOP: while (&strlen($untoken))
{
display &get_token($untoken);
display " ";
display $token;
display " ";
display &name(&indi($token));
}
}
display " ";
}
else
{
$charcounter = 1;
$untoken = $keypath;
display &get_token($untoken);
LOOP: while (($charcounter <= &strlen($relation)))
{
display $language_table->{&substring($relation, $charcounter, $charcounter)};
if ($do_names)
{
display &get_token($untoken);
display " ";
display $token;
display " ";
display &name(&indi($token));
}
$charcounter = ($charcounter + 1);
}
display " is ";
}
undef
}
sub main ()
{
my $child;
my $cnum;
my $eq;
my $fam;
my $female;
my $from_key;
my $from_person;
my $hopcount;
my $keypath;
my $male;
my $mode;
my $ne;
my $num;
my $path;
my $pathend;
my $person;
my $pkey;
my $pnum;
my $prev_hopcount;
my $spouse;
my $strcmp;
my $to_key;
my $to_person;
my $want_another;
$mark = {};
$keys = {};
$plist = [];
$hlist = [];
$language_table = {};
$language_table->{"f"} = "'s father";
$language_table->{"m"} = "'s mother";
$language_table->{"P"} = "'s parent";
$language_table->{"b"} = "'s brother";
$language_table->{"s"} = "'s sister";
$language_table->{"S"} = "'s sibling";
$language_table->{"z"} = "'s son";
$language_table->{"d"} = "'s daughter";
$language_table->{"C"} = "'s child";
$language_table->{"h"} = "'s husband";
$language_table->{"w"} = "'s wife";
$language_table->{"O"} = "'s spouse";
display &getindimsg($from_person, "Enter person to compute relation from:");
$from_key = &save(&key($from_person));
$hopcount = 0;
$prev_hopcount = (- 1);
$found = 0;
display &include($from_person, $hopcount, "", "", "");
display &getintmsg($mode, "Enter 1 for a single relation, 2 for several, 3 for all:");
display &getintmsg($language, "Enter 0 for brief, 1 for English-language relationships:");
display &getintmsg($do_names, "Enter 0 to omit, 1 to output names of all intervening relatives:");
if (($mode == 1))
{
display &getindimsg($to_person, "Enter one person to compute relation to:");
$to_key = &save(&key($to_person));
if (&strcmp($from_key, $to_key))
{
$mark->{$to_key} = "is not related to";
}
else
{
$plist = [];
$hlist = [];
}
}
LOOP: while ($pkey = (shift @$plist))
{
$person = &indi($pkey);
$hopcount = (shift @$hlist);
$path = $mark->{$pkey};
$keypath = $keys->{$pkey};
if (($hopcount != $prev_hopcount))
{
display &print(".");
$prev_hopcount = $hopcount;
}
$hopcount = ($hopcount + 1);
display &include(&father($person), $hopcount, $keypath, $path, "f");
display &include(&mother($person), $hopcount, $keypath, $path, "m");
$cnum = 0;
LOOP: for $child ( do { my $e = &parents($person); $e ? $e->children : ()} )
{
$cnum++;
if (&male($child))
{
$pathend = "b";
}
elsif (&female($child))
{
$pathend = "s";
}
else
{
$pathend = "S";
}
display &include($child, $hopcount, $keypath, $path, $pathend);
}
$pnum = 0;
LOOP: for $fam ($person->fams)
{
for $spouse ($fam->parents || undef)
{
next if $spouse && $spouse->xref eq $person->xref;
$pnum++;
if (&male($spouse))
{
$pathend = "h";
}
elsif (&female($spouse))
{
$pathend = "w";
}
else
{
$pathend = "O";
}
display &include($spouse, $hopcount, $keypath, $path, $pathend);
$cnum = 0;
LOOP: for $child ( do { my $e = $fam; $e ? $e->children : ()} )
{
$cnum++;
if (&male($child))
{
$pathend = "z";
}
elsif (&female($child))
{
$pathend = "d";
}
else
{
$pathend = "C";
}
display &include($child, $hopcount, $keypath, $path, $pathend);
}
}
}
}
if (($mode == 1))
{
display $from_key;
display " ";
display &name(&indi($from_key));
display &parse_relation(&save($mark->{$to_key}), $keys->{$to_key});
display $to_key;
display " ";
display &name(&indi($to_key));
display "\n";
}
if (($mode == 2))
{
$want_another = 1;
LOOP: while ($want_another)
{
display &getindimsg($to_person, "Enter person to compute relation to:");
$to_key = &save(&key($to_person));
display $from_key;
display " ";
display &name(&indi($from_key));
if ($path = $mark->{$to_key})
{
display &parse_relation(&save($path), $keys->{$to_key});
}
else
{
display " is not related to ";
}
display $to_key;
display " ";
display &name($to_person);
display "\n";
display &getintmsg($want_another, "Enter 0 if done, 1 if you want another to person:");
}
}
if (($mode == 3))
{
display $from_key;
display " ";
display &name(&indi($from_key));
display " --->\n";
$num = 0;
LOOP: for $to_person ($Ged->individuals)
{
$num++;
$to_key = &save(&key($to_person));
if ($path = $mark->{$to_key})
{
display &parse_relation(&save($path), $keys->{$to_key});
display $to_key;
display " ";
display &name($to_person);
display "\n";
}
}
}
undef
}
initialise();
main();
flush();
0
__END__
Original LifeLines program follows:
/*
relation - a LifeLines relation computing program
by Jim Eggert (eggertj@atc.ll.mit.edu)
Version 1, 21 November 1992
Version 2, 23 November 1992 (completely revamped)
Version 3, (changed format slightly, modified code somewhat)
Version 4, 6 July 1993 (added English language)
Version 5, 6 September 1993 (generified language)
This program calculates the relationship between individuals in a
database. It does so in three modes. Mode 1 just does one pair of
individuals and then exits. Mode 2 does any number of pairs with a
common "from" person. Mode 3 does all pairs with a common "from"
person. In general, mode 1 is fastest for simple relationships, but
if you want one complicated relationship, you may as well do them all.
Each computed relation is composed of the minimal combination of
parent (fm), sibling (bsS), child (zdC), and spouse (hw) giving the
relational path from the "from" person to the "to" person. Each
incremental relationship (or hop) is coded as follows, with the
capital letters denoting a person of unknown gender:
father f
mother m
parent P (not used)
brother b
sister s
sibling S
son z (sorry)
daughtr d
child C
husband h
wife w
spouse O (sorry again, but usually not possible)
The report gives the steps required to go from the first person to
the second person. Thus the printout
I93 John JONES fmshwz I95 Fred SMITH
means that John Jones' father's mother's sister's husband's wife's son
is Fred Smith. Notice in this case, the sister's husband's wife is
not the same as the sister, and the husband's wife's son is not the
same as the husband's son. Thus in more understandable English, John
Jones' paternal grandmother's sister's husband's wife's son from
another marriage is Fred Smith.
The program will do a trivial parsing of the path string. You can
change the language_table to have it print in different languages, as
long as the word order is unchanged.
If there is no relation, the program says so. That at least should be
easy to explain. Mode 3 only prints out those individuals who are
related to the "from" individual.
*/
global(plist)
global(hlist)
global(mark)
global(keys)
global(found)
global(do_names)
global(language)
global(language_table)
global(token)
global(untoken)
proc include(person,hops,keypath,path,pathend)
{
if (and(person,eq(found,0))) {
set(pkey,key(person))
if (entry,lookup(mark,pkey)) {
if (eq(strcmp(entry,"is not related to"),0)) {
set(found,1)
list(plist)
list(hlist)
insert(mark,save(pkey),save(concat(path,pathend)))
insert(keys,save(pkey),save(concat(concat(keypath,"@"),pkey)))
}
}
else {
enqueue(plist,save(pkey))
enqueue(hlist,hops)
insert(mark,save(pkey),save(concat(path,pathend)))
insert(keys,save(pkey),save(concat(concat(keypath,"@"),pkey)))
}
}
}
proc get_token(input) {
/* Parse a token from the input string.
Tokens are separated by one or more "@"s.
Set global parameter token to the first token string.
Set global parameter untoken to the rest of the string after first token.
*/
/* strip leading @s */
set(untoken,save(input))
set(first_delim,index(untoken,"@",1))
while (eq(first_delim,1)) {
set(untoken,save(substring(untoken,2,strlen(untoken))))
set(first_delim,index(untoken,"@",1))
}
/* get token and untoken */
if (not(first_delim)) {
set(token,save(untoken))
set(untoken,save(""))
}
else {
set(token,save(substring(untoken,1,sub(first_delim,1))))
set(untoken,save(
substring(untoken,add(first_delim,1),strlen(untoken))))
}
}
proc parse_relation(relation,keypath) {
if (not(language)) {
" " relation
if (do_names) {
set(untoken,keypath)
call get_token(untoken)
while(strlen(untoken)) {
call get_token(untoken)
" " token " " name(indi(token))
}
}
" "
}
else {
set(charcounter,1)
set(untoken,keypath)
call get_token(untoken)
while (le(charcounter,strlen(relation))) {
lookup(language_table,substring(relation,charcounter,charcounter))
if (do_names) {
call get_token(untoken)
" " token " " name(indi(token))
}
set(charcounter,add(charcounter,1))
}
" is "
}
}
proc main ()
{
table(mark)
table(keys)
list(plist)
list(hlist)
table(language_table)
insert(language_table,"f","'s father")
insert(language_table,"m","'s mother")
insert(language_table,"P","'s parent")
insert(language_table,"b","'s brother")
insert(language_table,"s","'s sister")
insert(language_table,"S","'s sibling")
insert(language_table,"z","'s son")
insert(language_table,"d","'s daughter")
insert(language_table,"C","'s child")
insert(language_table,"h","'s husband")
insert(language_table,"w","'s wife")
insert(language_table,"O","'s spouse")
getindimsg(from_person,
"Enter person to compute relation from:")
set(from_key,save(key(from_person)))
set(hopcount,0)
set(prev_hopcount,neg(1))
set(found,0)
call include(from_person,hopcount,"","","")
getintmsg(mode,"Enter 1 for a single relation, 2 for several, 3 for all:")
getintmsg(language,
"Enter 0 for brief, 1 for English-language relationships:")
getintmsg(do_names,
"Enter 0 to omit, 1 to output names of all intervening relatives:")
if (eq(mode,1)) {
getindimsg(to_person,
"Enter one person to compute relation to:")
set(to_key,save(key(to_person)))
if (strcmp(from_key,to_key)) {
insert(mark,to_key,"is not related to")
}
else {
list(plist)
list(hlist)
}
}
while (pkey,dequeue(plist)) {
set(person,indi(pkey))
set(hopcount,dequeue(hlist))
set(path,lookup(mark,pkey))
set(keypath,lookup(keys,pkey))
if (ne(hopcount,prev_hopcount)) {
print(".")
set(prev_hopcount,hopcount)
}
set(hopcount,add(hopcount,1))
call include(father(person),hopcount,keypath,path,"f")
call include(mother(person),hopcount,keypath,path,"m")
children(parents(person),child,cnum) {
if (male(child)) { set(pathend,"b") }
elsif (female(child)) { set(pathend,"s") }
else { set(pathend,"S") }
call include(child,hopcount,keypath,path,pathend)
}
families(person,fam,spouse,pnum) {
if (male(spouse)) { set(pathend,"h") }
elsif (female(spouse)) { set(pathend,"w") }
else { set(pathend,"O") }
call include(spouse,hopcount,keypath,path,pathend)
children(fam,child,cnum) {
if (male(child)) { set(pathend,"z") }
elsif (female(child)) { set(pathend,"d") }
else { set(pathend,"C") }
call include(child,hopcount,keypath,path,pathend)
}
}
}
if (eq(mode,1)) {
from_key " " name(indi(from_key))
call parse_relation(save(lookup(mark,to_key)),lookup(keys,to_key))
to_key " " name(indi(to_key)) "\n"
}
if (eq(mode,2)) {
set(want_another,1)
while (want_another) {
getindimsg(to_person,"Enter person to compute relation to:")
set(to_key,save(key(to_person)))
from_key " " name(indi(from_key))
if (path,lookup(mark,to_key)) {
call parse_relation(save(path),lookup(keys,to_key))
}
else { " is not related to " }
to_key " " name(to_person) "\n"
getintmsg(want_another,
"Enter 0 if done, 1 if you want another to person:")
}
}
if (eq(mode,3)) {
from_key " " name(indi(from_key)) " --->\n"
forindi(to_person,num) {
set(to_key,save(key(to_person)))
if (path,lookup(mark,to_key)) {
call parse_relation(save(path),lookup(keys,to_key))
to_key " " name(to_person) "\n"
}
}
}
}
--
Paul Johnson - paul@pjcj.net
http://www.pjcj.net