Building a search engine - Webmaster Forums - Webmaster forum for HTML, PHP, ASP, CSS and more
07-07-2006, 10:56 AM   #1
Junior Member
Join Date: Mar 2006
Posts: 19
Building a search engine

Hi

I have to build a specialised search engine and want to know the best way to process large volumes of random pages from the web. I have no prior experience building search engines and therefore little idea how to proceed. Indexing the pages in the particular way I need seems to be the easy part; I just don't know how to get the pages off the web in the first place. Most of the information I have found online covers intranets or large sites, where the location of the pages is assumed to be known. A vague and naïve question, but hopefully you get the nub of my gist.
Don Logan
09-28-2006, 11:37 PM   #2
Junior Member
Join Date: Sep 2006
Location: WhiteCourt Alberta, Canada
Posts: 3
Re: Building a search engine

Hi

I have built one.

First I had it do text search, with highlighting, line wraps, etc.

Then I had it display a picture whenever a match was found and
the line after the match was the path to a picture.

It could easily be altered to use a URL instead of a picture:
every line of text from the website would have a matching line
pointing to the page it came from.
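
For anyone trying the paired-line idea, here is a rough sketch in Python. It is not the poster's actual program. It assumes the data simply alternates a text line with the URL it came from. The names `search_pairs`, `highlight` and the `example.com` URLs are made up for illustration.

```python
import re


def search_pairs(lines, term):
    """Return (text, source) pairs whose text line contains term.

    Assumed layout: lines alternate, a text line followed by the
    URL (or path) that the text came from.
    """
    needle = term.lower()
    hits = []
    # Step two lines at a time: even index = text, odd index = source.
    for i in range(0, len(lines) - 1, 2):
        text, source = lines[i], lines[i + 1]
        if needle in text.lower():
            hits.append((text, source))
    return hits


def highlight(text, term):
    """Wrap each case-insensitive occurrence of term in square brackets."""
    return re.sub(re.escape(term), lambda m: "[" + m.group(0) + "]",
                  text, flags=re.IGNORECASE)


# Hypothetical sample data in the assumed alternating layout.
sample = [
    "Cheap web hosting reviews",
    "http://example.com/hosting.html",
    "Building a crawler in PHP",
    "http://example.com/crawler.html",
]

for text, source in search_pairs(sample, "crawler"):
    print(highlight(text, "crawler"), "->", source)
```

Only the text lines are searched, so a term that happens to appear in a URL does not produce a false hit.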

It works strictly from text files. To speed things up and eliminate
all the directory overhead, I merge all the files I want to search
into one large file. Then I search that file, listing the name of the file
each match came from. (When merging, I place a break between files with the file name in it.)
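
The merge-then-search approach described above might look roughly like this in Python. This is only a sketch under assumptions: the post does not say what the break line looks like, so the `=== FILE: ` marker is invented, as are the function names `merge_files`, `search_merged` and `demo`.

```python
import os
import tempfile

# Hypothetical break-line prefix; the real format is not given in the post.
MARKER = "=== FILE: "


def merge_files(paths, merged_path):
    """Concatenate text files into one, writing a marker line before each."""
    with open(merged_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(f"{MARKER}{path}\n")
            with open(path, encoding="utf-8", errors="replace") as src:
                for line in src:
                    # Make sure every line ends with a newline so the
                    # next marker always starts on its own line.
                    out.write(line if line.endswith("\n") else line + "\n")


def search_merged(merged_path, term):
    """Scan the merged file once; return (source file, line) per match."""
    needle = term.lower()
    current = None
    hits = []
    with open(merged_path, encoding="utf-8") as merged:
        for line in merged:
            line = line.rstrip("\n")
            if line.startswith(MARKER):
                current = line[len(MARKER):]  # remember which file we are in
            elif needle in line.lower():
                hits.append((current, line))
    return hits


def demo():
    """Build two small files in a temp directory, merge them, and search."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "notes.txt")
        b = os.path.join(tmp, "mail.txt")
        with open(a, "w", encoding="utf-8") as f:
            f.write("crawler ideas\nshopping list\n")
        with open(b, "w", encoding="utf-8") as f:
            f.write("Re: web Crawler question")
        merged = os.path.join(tmp, "merged.txt")
        merge_files([a, b], merged)
        return [(os.path.basename(src), line)
                for src, line in search_merged(merged, "crawler")]
```

One limitation of this sketch: a content line that happens to start with the marker text would be mistaken for a file break, so the marker should be something that never occurs in the data.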

All my emails, notes and net clippings since 1999 come to 75MB.
At 20,000,000 characters per second, it takes under 4 seconds
to search everything. No indexing is required and the data is
portable.
SpectateSwamp
All times are GMT -4.