Listing sentences in a DataGridView

    I have written the functions that filter the complete sentences and list them in a DataGridView. Only then did I realize that some sentences refer to another sentence. Sentences starting with “therefore”, “also”, “so”, “(in) this”, “(in) that”, “(in) these” or “(in) those” most probably refer to the sentence just before them. We should not measure subject relevancy for these sentences; we should include them only if the sentence before them has already been selected.
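The rule above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the names `is_referring` and `select_sentences` are made up for the example, and `is_relevant` is a hypothetical callback standing in for the subject relevancy check.

```python
import re

# Opening words that, per the post, usually signal a reference to the previous sentence.
REFERRING_OPENERS = re.compile(
    r"^(therefore|also|so|(in\s+)?(this|that|these|those))\b",
    re.IGNORECASE,
)

def is_referring(sentence):
    """Return True if the sentence opens with one of the referring words."""
    return bool(REFERRING_OPENERS.match(sentence.strip()))

def select_sentences(sentences, is_relevant):
    """Keep relevant sentences; keep a referring sentence only if the one before it was kept.

    `is_relevant` is a hypothetical callback standing in for the subject relevancy check.
    """
    selected = []
    previous_kept = False
    for sentence in sentences:
        if is_referring(sentence):
            keep = previous_kept  # no relevancy check for referring sentences
        else:
            keep = is_relevant(sentence)
        if keep:
            selected.append(sentence)
        previous_kept = keep
    return selected
```

Note that the word boundary in the pattern stops “Sometimes” or “Something” from being treated as “so”.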

    There may be many other things like this that we have to think about. Please check and reply to this post if you have come across a similar issue.

Sometimes the information we are searching for can come from places we do not expect. So read everywhere and find some valuable material to add to our project. I don’t mean that we should add all the “existing” fancy technologies; I mean something “conceptual” that we can use. I have read in a book in the library that a viewer coming to a web page follows links through only two or three pages. The book was “Web Developing, Complete Reference”.

On the other hand, we can’t expect to find the information directly on the page where Google takes us. The viewer has to go through links to find what he wants. So we have to cover that in our software as well. I mean that when it goes through a page it should also check the relevant links (judged by the link text) for information.
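One possible way to pick links by their text is sketched below in Python, using only the standard library's `html.parser`. The class `LinkCollector` and the function `relevant_links` are hypothetical names for this example, and the matching rule (a keyword appearing anywhere in the link text) is an assumption, not something we have agreed on.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from the anchors on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")  # None if the anchor has no href
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def relevant_links(html, keywords):
    """Return hrefs whose link text contains at least one keyword (case-insensitive)."""
    collector = LinkCollector()
    collector.feed(html)
    wanted = [k.lower() for k in keywords]
    return [href for href, text in collector.links
            if any(k in text.lower() for k in wanted)]
```

For example, with the keyword "engine", a link labelled "Car engines" would be followed while a link labelled "About us" would be skipped.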

So I expect you all to “Think” and “Write down” something, then “Draw” diagrams. These diagrams do not need to be UML standard diagrams. Any sketch that helps others understand the concept will be good. I will appreciate it even if you just think about some “concepts” for the project. Thank you!

References
Thomas A. Powell, Web Developing, Complete Reference.

Meaning of a word

        In this section we will discuss the meaning of the word “Meaning” and what the meaning of a sentence is. When we read a sentence, some words are less important than the others; these are the words we drop when we summarize. The nouns in a sentence carry most of its meaning.

Let’s take this example:
           The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.

Edsger Dijkstra

        In that sentence there are some words that make it meaningful for the subject it is relevant to (AI): Question, Computer, Think, Submarine and Swim. But if we take these words alone, they do not give us the meaning either.
        When we summarize a sentence we do a similar thing. We rebuild the sentence from its own words or their synonyms so that it gives the same meaning in a more efficient way. To do that we have to understand the meaning of the sentence, the function of each word and each word's subject relevancy.
        The word “Question” in the above sentence has a special function: it puts the phrase “A computer can think” in a comparison with the second phrase about the “Submarine”. Therefore the words “Submarine” and “Swim” do not make the sentence relevant to watercraft or the sea.
        These are the boundary problems we have to face in this project (within our approach). But we have to build a usable solution within these boundaries.
        In the above sentence there are some words like the, of, is, a (articles and similar small words) and some words like whether, can, no, more, interesting, than (adjectives and the like). These words are not relevant to any subject, but they help us understand the meaning. Let’s write the sentence again without the first group of words:

           Question whether computer can think no more interesting than question whether submarine can swim.

        OK, that does not do much harm to the sentence’s meaning. That means a sentence can still convey its meaning without the articles. But we still need the adjectives and the like to understand it (even though they have no subject relevancy).
        Now you will understand that meaning is a broader subject than subject relevancy. So we are going to build a solution that understands subject relevancy, not meaning.
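The word-dropping step can be tried out with a few lines of Python. This is just a sketch: `DROPPED_WORDS` and `drop_words` are names invented for the example, and the word list is simply the four words removed from the Dijkstra sentence above, not a complete list.

```python
# The four words removed in the example; strictly, only "the" and "a" are articles.
DROPPED_WORDS = {"the", "of", "is", "a"}

def drop_words(sentence, dropped=DROPPED_WORDS):
    """Remove the listed words from a sentence and re-capitalize the first letter."""
    kept = [w for w in sentence.split()
            if w.lower().strip(".,;:!?") not in dropped]
    result = " ".join(kept)
    return result[:1].upper() + result[1:]

original = ("The question of whether a computer can think is no more interesting "
            "than the question of whether a submarine can swim.")
print(drop_words(original))
```

Run on the original sentence, this prints the shortened sentence shown above.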

References
Wikipedia, The Free Encyclopedia

The Internet is the most valuable asset of this world. It is the world’s fastest growing multimedia database. The query tools for getting into these resources are the search engines. But we all know that, for various economic and political reasons, these search engines do not always give us what we need. Moreover, we have to spend a long time before we get the right information. If we had a mechanical tool to read through the results and their destination pages and filter out the right information, it would save us time. Most people do not make the best use of the Internet simply because of the irrelevancy of the results given by search engines. They see going through the search results and finding what they need as an overhead.

We are not going to build another search engine, since there are enough search engines in this world. Instead we are going to hook into an existing search engine to get links. The software will read each sentence and decide the relevant subject field of the sentence or paragraph through the MRM dictionary.

There is a suggestion to integrate a voice engine and/or a voice recognition system to improve the usability of the software. It would also be better if we could use a method to simplify and/or optimize sentences for readability and/or efficiency. That will save time and let us read exactly what we need to know. If we can take this beyond expectations, it will be a good back end for a more realistic artificial intelligence.

The primary objective of this project is to improve the usability of the Internet.

AKA => MRM Dictionary

Prerequisites to read: Relevancy of a word to a Human Subject Field

        This dictionary lets a computer understand the meaning of a word in terms of the subjects it is relevant to. The dictionary should be able to grow by itself, by finding which words commonly come together and what the relevant subject for each word is; in other words, by inferring the subject relevancy of a word from the words it comes with.

As an example…

MRM Dictionary

        Here, to make it easier to understand, we have used only three dimensions. Actually this is a multidimensional matrix with n axes (subjects). Don’t try to picture it as a multidimensional graph. If it works for three dimensions it will work for n dimensions.

        We can use a 1:N mapping in a simple database, or a flat file for more speed. We have to discuss that. If we have to process these queries faster, we may have to build our own data retrieval technique instead of using a conventional database.

        You will find that there is a small problem with the word “oil”: it relates to “Automobile” as well as to “Food”. Such words can cause problems. But we take the average of these values over a whole sentence to minimize such problems.
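To see how averaging softens the “oil” problem, here is a small Python sketch. The weights in `MRM` are invented for illustration only (they are not the project's values), and `sentence_relevancy` is a hypothetical name. Words missing from the dictionary are simply ignored, which is also an assumption.

```python
SUBJECTS = ("Automobile", "Food", "Art")

# Illustrative weights only (percent relevancy per subject); not real project data.
MRM = {
    "car":    (100, 0, 0),
    "engine": (90, 0, 0),
    "oil":    (50, 50, 0),
    "pizza":  (0, 100, 0),
}

def sentence_relevancy(sentence):
    """Average the per-subject weights of every known word in the sentence."""
    words = [w.strip(".,;:!?").lower() for w in sentence.split()]
    rows = [MRM[w] for w in words if w in MRM]
    if not rows:
        return dict.fromkeys(SUBJECTS, 0.0)
    return {subject: sum(row[i] for row in rows) / len(rows)
            for i, subject in enumerate(SUBJECTS)}

scores = sentence_relevancy("The car engine needs oil.")
```

With these made-up weights, “oil” alone is split evenly between “Automobile” and “Food”, but the sentence as a whole averages to 80 for “Automobile” because “car” and “engine” pull it that way.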

    Every word we understand has a meaning to us in one or many subjects. The relevancy of a word to a subject may differ from person to person, according to their subject interests. But we are going to describe this generally, for all people.
This is basically about mapping the relationship between…

                                                Word ==> Subject

    As there are many subjects we are interested in, we have to represent this in a 1:N (one-to-many) database. But for efficiency we are also considering a flat file that holds a matrix.

    We are going to use a matrix because we are mapping the Relevancy of a Word to a Subject. Actually, the arrow in the above diagram will carry a numerical value for that. We put a Weight on the relevancy.

    For example, the word “Car” will be 100% relevant to the subject “Automobile” and 0% relevant to “Art” or “Education”, and the word “Pizza” will be 100% relevant to the subject “Food” and 0% relevant to “Science”.
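A flat-file matrix of this kind could look something like the sketch below (Python, standard `csv` module). The file layout, the names `load_matrix` and `relevancy`, and every zero not stated in the example above are assumptions made for illustration; we have not decided on a format yet.

```python
import csv
import io

# Hypothetical flat-file layout: a header row of subjects, then one row per word.
# Only Car/Automobile, Car/Art, Car/Education, Pizza/Food and Pizza/Science come
# from the example in the text; the remaining zeros are assumed.
FLAT_FILE = """word,Automobile,Food,Art,Education,Science
car,100,0,0,0,0
pizza,0,100,0,0,0
"""

def load_matrix(text):
    """Parse the flat file into {word: {subject: weight}}."""
    reader = csv.DictReader(io.StringIO(text))
    matrix = {}
    for row in reader:
        word = row.pop("word")
        matrix[word] = {subject: float(value) for subject, value in row.items()}
    return matrix

def relevancy(matrix, word, subject):
    """Weight of `word` for `subject`, or 0.0 if the word or subject is unknown."""
    return matrix.get(word.lower(), {}).get(subject, 0.0)

matrix = load_matrix(FLAT_FILE)
```

Each row is one word and each column is one subject, so adding a subject means adding a column, and the value in a cell is the Weight on the Word ==> Subject arrow.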

N.B. This document is a prerequisite for level 3.

We have to take a step-by-step approach, so we have to plan levels and complete them one by one. After each level we will be able to deliver a usable version of the software. In the end we are going to make the software teach itself new words and their relevancy to human subject fields, and make it able to sort sentences and paragraphs of information into categories.

In the first level we are making something a little more than a web browser. There will be a text field to enter keywords and a search button instead of the address bar. When the user enters keywords, it will send search requests to Google or Yahoo or both. Then it will extract the search result links from the results page.

In the second level we are going to develop this further to visit the result pages simultaneously and extract the text on those pages. It will separate the paragraphs and sentences and assign them to arrays for further processing.
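A very naive version of the splitting step might look like this in Python. The function names are hypothetical, and the rules are assumptions: paragraphs are taken to be separated by blank lines, and a sentence is taken to end at “.”, “!” or “?” followed by whitespace. Abbreviations such as “e.g.” would be split wrongly, so this is only a starting point.

```python
import re

def split_paragraphs(text):
    """Split text on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Naive split after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def page_to_arrays(text):
    """Return a list of paragraphs, each a list of its sentences."""
    return [split_sentences(p) for p in split_paragraphs(text)]

arrays = page_to_arrays("Cars need oil. Pizza is food!\n\nIs art a subject?")
```

The result is a nested array: one inner array of sentences per paragraph, ready to be passed on to the later levels.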

Before going to further levels we need to make another expert system, so that the computer can understand the relevancy of a word to a specific Human Subject Field. We will discuss the functionality of this module in the MRM Dictionary document. For now, consider it a “closed module that will work”. When we query it for a word, it will return an array of values telling what that word means to a human being. Read the Relevancy of a Word to a Human Subject Field document for further information.

Using the MRM dictionary we are going to make the third level version of the software, which will categorize the search results by subject. It will query each word in a sentence for its relevancy to subjects. The result will be a table of subjects and percentage relevancy. We then have to average these values and sort the subjects by how well the sentence fits them.