Xref: feenix.metronet.com comp.infosystems.gopher:4650
Newsgroups: comp.infosystems.gopher
Path: feenix.metronet.com!news.ecn.bgu.edu!usenet.ins.cwru.edu!howland.reston.ans.net!europa.eng.gtefsd.com!darwin.sura.net!wupost!newsfeed.rice.edu!rice!riddle
From: riddle@is.rice.edu (Prentiss Riddle)
Subject: g2linkdb (was re: Waisindexing a .Link)
Message-ID: <CC67sH.89G@rice.edu>
Sender: news@rice.edu (News)
Organization: Ministry of Information, William's Marsh
References: <1993Aug18.222458.14751@selway.umt.edu>
Date: Sun, 22 Aug 1993 17:12:17 GMT
Lines: 167

jdc@selway.umt.edu (John-David Childs) writes:
> Anyone figured out how to waisindex a .Link?  For instance, if I search
> for the word humor, I want gopher to find the LINK to wiretap's humor
> directory and if I press enter, I should be able to go there.  What it
> does now is return the .Link file (as a text file).  I imagine this
> would take some modification to either the wais or gopher code (or
> maybe both).

Here's the inelegant solution I came up with when facing this problem:
a go4gw module called "g2linkdb".

I wrote it because the Yale collection of online library catalogs used
to return a mixture of text descriptions of catalogs and Gopher
".links" entries pointing to them.  Without this hack, the .links
entries would show up in menus as ordinary text.  I wanted a way to
turn them into "live" links.  (I believe that Marie-Christine Mahe of
Yale has since implemented something like this herself.)

If I were in charge of a database of links like Marie-Christine's, I
would put a copy of each resource description into the corresponding
.links file, commented out with # signs.  A search for "widget" could
then return a live link for an entry listed like this:

Name=Nuts and Bolts Database (from the University of Foobar)
Type=1
Port=70
Path=1/nuts+bolts
Host=foobar.edu
# The Nuts and Bolts Database is a project of the Department of Widget
# Studies, College of Mechanical Engineering, Foobar University.  It
# is especially noted for its information on reverse-threaded nuts and
# bolts for use in the southern hemisphere.
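
To make the idea concrete, here is a minimal sketch of how an entry
like the one above could be turned into a Gopher menu line.  It is
written in current Perl rather than the Perl 4 of the module below, and
parse_link_entry is a hypothetical helper of my own naming, not part of
g2linkdb.  It assumes that lines starting with # are comments and that
Name, Type, Host and Port are all required.

```perl
#!/usr/bin/perl
# Sketch only: parse_link_entry is a hypothetical helper, not part of g2linkdb.
use strict;
use warnings;

# Turn the text of one ".links" entry into a Gopher menu line
# ("TypeName<TAB>Path<TAB>Host<TAB>Port"), or return undef if a
# required field is missing.  Comment lines (leading "#") are skipped.
sub parse_link_entry {
    my ($text) = @_;
    my %field;
    foreach my $line (split /\n/, $text) {
        next if $line =~ /^\s*#/;
        if ($line =~ /^\s*(Name|Type|Host|Port|Path)=(.*?)\s*$/) {
            $field{$1} = $2;
        }
    }
    foreach my $key (qw(Name Type Host Port)) {
        return undef unless defined $field{$key} && length $field{$key};
    }
    my $path = defined $field{Path} ? $field{Path} : "";
    return join("\t", "$field{Type}$field{Name}", $path,
                $field{Host}, $field{Port});
}

my $entry = <<'END';
Name=Nuts and Bolts Database (from the University of Foobar)
Type=1
Port=70
Path=1/nuts+bolts
Host=foobar.edu
# The Nuts and Bolts Database is a project of the Department of Widget
# Studies, College of Mechanical Engineering, Foobar University.
END

print parse_link_entry($entry), "\n";
```

The commented-out description never reaches the menu line, but it is
still in the file for the indexer to find, which is the whole point of
putting it there.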

One of these days, I may write something similar that scans each text
file for URLs and turns the Gopher-compatible ones into "live" links as
well...
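
A rough sketch of what the URL half of that might look like, assuming
URLs of the form gopher://host:port/<type><selector> with the port and
everything after the host optional.  gopher_url_to_item is a
hypothetical name, and the defaults (port 70, type 1) are my own
assumptions; URLs whose selector decodes to something containing a tab
or newline are simply rejected rather than handled.

```perl
#!/usr/bin/perl
# Sketch only: gopher_url_to_item is a hypothetical helper, not part of g2linkdb.
use strict;
use warnings;

# Convert a gopher:// URL into a Gopher menu line, or return undef if
# the string is not a Gopher URL this sketch understands.
sub gopher_url_to_item {
    my ($url, $title) = @_;
    return undef
        unless $url =~ m{^gopher://([^/:\s]+)(?::(\d+))?(?:/(.)?(\S*))?$}i;
    my ($host, $port, $type, $selector) = ($1, $2, $3, $4);
    $port     = 70  unless defined $port;      # assumed default port
    $type     = "1" unless defined $type;      # assumed default: a directory
    $selector = ""  unless defined $selector;
    # Undo %XX escapes in the selector.
    $selector =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    # A tab or newline would corrupt the menu line, so give up on those.
    return undef if $selector =~ /[\t\r\n]/;
    $title = $url unless defined $title;       # fall back to the URL itself
    return join("\t", "$type$title", $selector, $host, $port);
}

print gopher_url_to_item("gopher://foobar.edu:70/11/nuts+bolts",
                         "Nuts and Bolts Database"), "\n";
```

Scanning a text file would then be a matter of pulling out each
gopher:// string and replying with the converted item where the
conversion succeeds.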

(BTW, for the person who said that what was needed was a new WAIS
type:  maybe I'm missing something, but I can't figure out how a new
WAIS type could tell Gopher to treat the items returned as links
instead of plain text.  For use with g2linkdb, the usual "paragraph"
type is sufficient.)

-- Prentiss Riddle ("aprendiz de todo, maestro de nada") riddle@rice.edu
-- Unix Systems Programmer, Office of Networking and Computing Systems
-- Rice University, POB 1892, Houston, TX 77251 / Mudd 208 / 713-285-5327
-- Opinions expressed are not necessarily those of my employer.
------------------------------< cut here >------------------------------
#!/usr/local/bin/perl
#
# g2linkdb -- go4gw module to process a WAIS (or other) search whose
#             results may include Gopher links
#
# History:
# 04/07/93 PASR	Original version by Prentiss Riddle (riddle@rice.edu).

#----------------------------------------------------------------------
# variables you should change:

$linkdb_maxhits = 40;	# Maximum number of search results to accept

#----------------------------------------------------------------------


sub linkdb_main {
	local($_) = @_;
	local($host, $port, $path, $search, @items);
	local($hits);

#	$Gdebug = 1;
	
	# Check input format
	# (May want to modify this later to require a search spec!)
	unless ( ($host, $port, $path, $search) =
		$_ =~ /^([^:\t]*):([^:\t]*):([^\t]*)\t?(.*)$/) {
		&Greply("0Please specify an index to search and a selector string.\t\t\t");
		&Greply(".");
		exit(0);
	}

	# Make initial query of the specified server.
	&GopenServer($host, $port);
	if ($search) {
		&Gsend("$path\t$search");
	} else {
		&Gsend("$path");
	}
	# Gather the resulting items.
	$hits = $linkdb_maxhits;
	while ($_ ne "." && $hits-- > 0) {
		$_ = &Grecv;
		push(@items, "$_") if (/^.*\t.*\t.*\t.*$/);
	}
	&GcloseServer;

	# Look through the collected items, looking for ones which might
	# be files containing Gopher links.
	while ($_ = shift(@items)) {
		if (&linkdb_link("$_")) {
			&linkdb_chase("$_");
		} else {
			&Greply("$_");
		}
	}

	&Greply(".");
	exit(0);
}

#----------------------------------------------------------------------
# Function which tries to guess from the Gopher title whether an item
# is a file containing a Gopher link.  You may want to change this for
# a specific local application.

sub linkdb_link {
	local($_) = @_;

	# Match anything that looks like a line in a ".links" file.
	return(1) if (/^0(Name|Type|Host|Port|Path)=/);

	# Match the format used by the Yale libraries.
	return(1) if (m#^.\.\S+\s*/Libraries/by\.place/#);

	# Give up.
	return(0);
}

#----------------------------------------------------------------------
# Subroutine which fetches a Gopher file and tries to parse it as a link
# to a Gopher item.  If successful, it returns the resulting link via
# &Greply; if not, it returns the original item via &Greply.

sub linkdb_chase {
	local($chaseitem) = @_;
	local($chasename, $chasepath, $chasehost, $chaseport);
	local($Name, $Type, $Host, $Port, $Path);
	local($_, $hits);

	if (($chasename, $chasepath, $chasehost, $chaseport) =
		    $chaseitem =~ /^0([^\t]*)\t([^\t]*)\t([^\t]*)\t([^\t]*)/) {
		$Name = $Type = $Host = $Port = $Path = "";
		&GopenServer($chasehost, $chaseport);
		&Gsend("$chasepath");
		$_ = &Grecv;
		$hits = $linkdb_maxhits;
		while ($_ ne "." && $hits-- > 0) {
			$Name = $1 if (/^\s*Name=([^\t]*)/);
			$Type = $1 if (/^\s*Type=([^\t]*)/);
			$Host = $1 if (/^\s*Host=([^\t]*)/);
			$Port = $1 if (/^\s*Port=([^\t]*)/);
			$Path = $1 if (/^\s*Path=([^\t]*)/);
			$_ = &Grecv;
		}
		&GcloseServer;
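		# Compare $Type against "" because Type=0 (a text file) is a
		# valid type but is false in a plain truth test.
		if ($Name && $Type ne "" && $Host && $Port) {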
			# Success! Return the link we parsed.
			&Greply("$Type$Name\t$Path\t$Host\t$Port");
			return;
		}
	}

	# We didn't parse this as a link -- return original item.
	&Greply("$chaseitem");
}
#----------------------------------------------------------------------

1; # for require
