What are we going to do
We are going to build software that refines text information found on the Internet to match the user’s exact requirements. The software will generate a document containing the required information, refined from the Internet.
When a user enters keywords and starts a search, the software will send those keywords to several existing Internet search engines simultaneously. As the results are returned, it will request the pages that the search results link to. From those result pages, the software will extract the human-readable text. It will then categorize each sentence of that text according to its relevance to the required subject field, that is, to the interest of the user.
How are we going to do this
We will take a step-by-step approach, planning levels to complete one by one. After each level we will be able to deliver a usable version of the software. In the end, the software should learn new words and their relevancy to human subject fields by itself, and be able to sort sentences and paragraphs of information into categories.
In the first level we are making something little more than a web browser. Instead of an address bar, there will be a text field for keywords and a search button. When the user enters keywords, the software will send search requests to Google or Yahoo or both. It will then extract the search result links from the results page.
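A minimal sketch of the first level, assuming we build the search URLs ourselves and pull the links out of the returned HTML with the Python standard library. The names `build_search_urls`, `LinkExtractor` and `extract_links` are hypothetical, and the URL templates and query parameters are assumptions; the real engines may need other parameters, an API, or extra filtering of navigation links.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Assumed search endpoints and query parameter names (not confirmed by this plan).
SEARCH_ENGINES = {
    "google": ("https://www.google.com/search", "q"),
    "yahoo": ("https://search.yahoo.com/search", "p"),
}


def build_search_urls(keywords, engines=("google", "yahoo")):
    """Return one search URL per engine for the given list of keywords."""
    urls = {}
    for name in engines:
        endpoint, param = SEARCH_ENGINES[name]
        urls[name] = endpoint + "?" + urlencode({param: " ".join(keywords)})
    return urls


class LinkExtractor(HTMLParser):
    """Collect the targets of all <a href="..."> tags on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page they were found on.
        absolute = urljoin(self.base_url, href)
        # Keep only web links, and skip duplicates.
        if absolute.startswith(("http://", "https://")) and absolute not in self.links:
            self.links.append(absolute)


def extract_links(html, base_url):
    """Return the unique http(s) links found in an HTML results page."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

Fetching the pages is left out on purpose, so the sketch works on any HTML string we hand to `extract_links`.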
In the second level we will extend this to visit the result pages simultaneously and extract the text information on those pages. It will separate the paragraphs and sentences and store them in arrays for further processing.
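The text extraction of the second level could look roughly like the sketch below. It assumes that block-level tags mark paragraph boundaries and that a sentence ends at `.`, `!` or `?` followed by whitespace, which is a simplification (abbreviations will be split wrongly). `TextExtractor`, `split_sentences` and `page_to_sentences` are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Assumption: these tags start or end a paragraph of readable text.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Content of these tags is not human-readable text.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collect the readable text of a page as a list of paragraphs."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = []
        self._skip_depth = 0

    def _flush(self):
        # Join the buffered pieces and collapse runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def split_sentences(paragraph):
    """Split a paragraph after '.', '!' or '?' when whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]


def page_to_sentences(html):
    """Return a list of paragraphs, each a list of sentences."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return [split_sentences(p) for p in parser.paragraphs]
```

Visiting the pages simultaneously is not shown here; one option would be to fetch them with a thread pool such as `concurrent.futures.ThreadPoolExecutor` and pass each downloaded page to `page_to_sentences`.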
Before going on to further levels we need to build another expert system that lets the computer understand the relevancy of a word to a specific Human Subject Field. We discuss the functionality of this module in the MRM Dictionary document. For now, consider it a “closed module that will work”. When we query it for a word, it will return an array of values describing what that word means to a human being. Read the Relevancy of a Word to a Human Subject Field document for further information.
Using the MRM dictionary we will build the third-level version of the software, which will categorize the search results by subject. It will query the dictionary for the relevancy of each word in a sentence to each subject. The result will be a table of subjects and percentage relevancies. We then average these values per subject and sort the subjects by how well the sentence fits them.
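As a rough illustration of the averaging and sorting step, the sketch below uses a small hand-written table as a stand-in for the MRM dictionary. The table contents, the `query_word` and `categorize_sentence` names, and the choice to average over all words in the sentence (so that unknown words count as 0%) are assumptions for this example, not the actual behaviour of the MRM dictionary.

```python
import re

# Hypothetical stand-in for the MRM dictionary:
# word -> {subject: relevancy in percent}. The values are made up.
SAMPLE_DICTIONARY = {
    "protein": {"biology": 90.0, "cooking": 40.0},
    "cell": {"biology": 80.0, "telecom": 50.0},
    "recipe": {"cooking": 100.0},
}


def query_word(word, dictionary):
    """Return the subject relevancies of a word, or an empty table if unknown."""
    return dictionary.get(word.lower(), {})


def categorize_sentence(sentence, dictionary):
    """Return (subject, average percent) pairs, most relevant subject first."""
    words = re.findall(r"[A-Za-z']+", sentence)
    if not words:
        return []
    totals = {}
    for word in words:
        for subject, percent in query_word(word, dictionary).items():
            totals[subject] = totals.get(subject, 0.0) + percent
    # Assumption: average over every word, so unknown words lower the score.
    averages = {subject: total / len(words) for subject, total in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

For the sentence "Protein cell", this sample table gives biology an average of 85%, telecom 25% and cooking 20%, so biology is ranked first.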
