Page 1 of 1

How to handle text input

PostPosted: Mon May 28, 2012 10:51 pm
by phyzix5761
I am trying to write a language translation program and need to know what is the best way to handle text input. I am aware of how to convert an actor into a text actor but I have never used one. What is best way to compare what the user inputs with a database? Also, can the text the user inputs be separated by word?

Re: How to handle text input

PostPosted: Mon May 28, 2012 11:24 pm
by skydereign
Really you shouldn't use gE for something like that. We aren't given the strtok function, which would help immensely when parsing a string. But, you can use gE's built in text input system, and just use the actor's text variable. To compare strings, you have to use the strcmp function. The text variable is a string, so you can either copy its contents into another string, or you can just use the variable directly. strcmp returns a 0 if the two strings you pass it are the same.

You could write a function that acts like strtok, or just use sscanf to extract words from the string, but that could get a little complex.

I wouldn't suggest using just strcmp and an array. Just say you have 1000 words you can translate. The array solution to this would require you to loop through the entire array (or until you find the word). That means you may end up using strcmp 1000 times, which isn't very efficient. You can setup a hash for this, which could, if written well, result in a single search for most words.

Re: How to handle text input

PostPosted: Mon May 28, 2012 11:56 pm
by phyzix5761
Any idea (if you're motivated) how I can write an strtok function? Or how sscanf would work?

Re: How to handle text input

PostPosted: Tue May 29, 2012 12:37 am
by phyzix5761
well I figured out how to use sscanf. Now my question is how would you handle the word database if not with an array? You say hash but I'm not sure what that is.

Re: How to handle text input

PostPosted: Tue May 29, 2012 3:07 am
by skydereign
It isn't that the array can't be used, its just you really shouldn't use an iterative way of going through the array. You should use what's called a hash function. The idea is you pass it the word (called a key) and it uses it to return an index. You can then store the word in that index. That way if you want to retrieve the word, you just pass it the key, and it can tell you exactly where the word is by giving you the proper index.

Re: How to handle text input

PostPosted: Tue May 29, 2012 1:59 pm
by phyzix5761
If I am understanding correctly the hash needs to convert the key into an index. How does it do that?

Re: How to handle text input

PostPosted: Tue May 29, 2012 4:24 pm
by skydereign
Well, the key doesn't have to be a string. It could be anything, for instance an int. Here is one of the simplest examples of a hash function.
Code: Select all
int
hash (int key)
{
    return key;
}

Of course when it is this simple you would think it isn't really necessary, but actually using a function to return the index means you can later change the hash function and not have to change anything else (one of the reasons you want to use functions). Now the method you create the index from the key is up to you. There are plenty of different ways to create the index. The above example shows a perfect hash (meaning for every unique key, there would be a unique index). When using a hashed array, you do run into some minor problems. For instance, you only have storage for a finite number of storage points.

Using the same example, just say the array was only of size 10. In that case you need the hash function to never return larger indexes.
Code: Select all
int
hash (int key)
{
    return key%10; // only returns 0-9
}

As you might note, this means that if you plug in 0 and 10, they will return the same index. That is what is called a collision. There are various methods to resolve a collision. But, a simple hash function for strings, that is pretty good is this.
Code: Select all
int
hash (char* key)
{
    int i;
    int len = strlen(key);
    int index = 0;
    for(i=0;i<len;i++)
    {
        index+=key[i];
    }

    return index%HASH_SIZE; // replacing HASH_SIZE with the size of the hashed array
}

Re: How to handle text input

PostPosted: Tue May 29, 2012 7:51 pm
by phyzix5761
Thank you for your reply. What if I have two words that are the same length? Would that cause any problems?

Re: How to handle text input

PostPosted: Tue May 29, 2012 8:00 pm
by skydereign
phyzix5761 wrote:What if I have two words that are the same length? Would that cause any problems?

Think about it. As I said, you aren't going to be able to create a perfect hash function, as there will be collisions in almost any string based hash you use. So, if it is just adding up the characters then if the strings have the same characters, there will be a collision. For instance the "words" ab and ba will be stored in the same location (causing a collision). Words can also collide due to the modulo and just summing the characters up (different length words can still collide). But, that is why you add some form of collision resolution. For instance you can use a linked list, a bucket, or a linear system. The list would allow for unlimited number of entries, while the others wouldn't.

Re: How to handle text input

PostPosted: Wed Jun 06, 2012 7:23 pm
by lcl
Hey skydereign, I got quite interested of that hash function system.
Could you explain how one is supposed to store the words in to the right indexes? Won't there be quite many indexes not used, because the numbers that the characters return are quite large?

Thank you in advance! :)

Re: How to handle text input

PostPosted: Thu Jun 07, 2012 12:13 am
by skydereign
I actually had to write a hash adt for a group project in one of my classes. I can post that with some explanation and comments if you want. There's nothing in it that gE can't do, so I could upload a ged demonstrating it as well. If so, you can expect it tomorrow or the day after. But in the meantime, the sum I mentioned isn't the actual index, as it is shrunk by using modulus.

Re: How to handle text input

PostPosted: Sun Jun 10, 2012 11:43 pm
by lcl
I would be thankful if you sent me a ged demonstrating it :D