How to handle text input

Non-platform specific questions.

How to handle text input

Postby phyzix5761 » Mon May 28, 2012 10:51 pm

I am trying to write a language translation program and need to know what is the best way to handle text input. I am aware of how to convert an actor into a text actor but I have never used one. What is best way to compare what the user inputs with a database? Also, can the text the user inputs be separated by word?
phyzix5761
 
Posts: 261
Joined: Sun Feb 27, 2011 4:28 am
Score: 18 Give a positive score

Re: How to handle text input

Postby skydereign » Mon May 28, 2012 11:24 pm

Really you shouldn't use gE for something like that. We aren't given the strtok function, which would help immensely when parsing a string. But, you can use gE's built in text input system, and just use the actor's text variable. To compare strings, you have to use the strcmp function. The text variable is a string, so you can either copy its contents into another string, or you can just use the variable directly. strcmp returns a 0 if the two strings you pass it are the same.

You could write a function that acts like strtok, or just use sscanf to extract words from the string, but that could get a little complex.

I wouldn't suggest using just strcmp and an array. Just say you have 1000 words you can translate. The array solution to this would require you to loop through the entire array (or until you find the word). That means you may end up using strcmp 1000 times, which isn't very efficient. You can setup a hash for this, which could, if written well, result in a single search for most words.
User avatar
skydereign
 
Posts: 3510
Joined: Mon Jul 28, 2008 8:29 am
Score: 589 Give a positive score

Re: How to handle text input

Postby phyzix5761 » Mon May 28, 2012 11:56 pm

Any idea (if you're motivated) how I can write an strtok function? Or how sscanf would work?
phyzix5761
 
Posts: 261
Joined: Sun Feb 27, 2011 4:28 am
Score: 18 Give a positive score

Re: How to handle text input

Postby phyzix5761 » Tue May 29, 2012 12:37 am

well I figured out how to use sscanf. Now my question is how would you handle the word database if not with an array? You say hash but I'm not sure what that is.
phyzix5761
 
Posts: 261
Joined: Sun Feb 27, 2011 4:28 am
Score: 18 Give a positive score

Re: How to handle text input

Postby skydereign » Tue May 29, 2012 3:07 am

It isn't that the array can't be used, its just you really shouldn't use an iterative way of going through the array. You should use what's called a hash function. The idea is you pass it the word (called a key) and it uses it to return an index. You can then store the word in that index. That way if you want to retrieve the word, you just pass it the key, and it can tell you exactly where the word is by giving you the proper index.
User avatar
skydereign
 
Posts: 3510
Joined: Mon Jul 28, 2008 8:29 am
Score: 589 Give a positive score

Re: How to handle text input

Postby phyzix5761 » Tue May 29, 2012 1:59 pm

If I am understanding correctly the hash needs to convert the key into an index. How does it do that?
phyzix5761
 
Posts: 261
Joined: Sun Feb 27, 2011 4:28 am
Score: 18 Give a positive score

Re: How to handle text input

Postby skydereign » Tue May 29, 2012 4:24 pm

Well, the key doesn't have to be a string. It could be anything, for instance an int. Here is one of the simplest examples of a hash function.
Code: Select all
int
hash (int key)
{
    return key;
}

Of course when it is this simple you would think it isn't really necessary, but actually using a function to return the index means you can later change the hash function and not have to change anything else (one of the reasons you want to use functions). Now the method you create the index from the key is up to you. There are plenty of different ways to create the index. The above example shows a perfect hash (meaning for every unique key, there would be a unique index). When using a hashed array, you do run into some minor problems. For instance, you only have storage for a finite number of storage points.

Using the same example, just say the array was only of size 10. In that case you need the hash function to never return larger indexes.
Code: Select all
int
hash (int key)
{
    return key%10; // only returns 0-9
}

As you might note, this means that if you plug in 0 and 10, they will return the same index. That is what is called a collision. There are various methods to resolve a collision. But, a simple hash function for strings, that is pretty good is this.
Code: Select all
int
hash (char* key)
{
    int i;
    int len = strlen(key);
    int index = 0;
    for(i=0;i<len;i++)
    {
        index+=key[i];
    }

    return index%HASH_SIZE; // replacing HASH_SIZE with the size of the hashed array
}
User avatar
skydereign
 
Posts: 3510
Joined: Mon Jul 28, 2008 8:29 am
Score: 589 Give a positive score

Re: How to handle text input

Postby phyzix5761 » Tue May 29, 2012 7:51 pm

Thank you for your reply. What if I have two words that are the same length? Would that cause any problems?
phyzix5761
 
Posts: 261
Joined: Sun Feb 27, 2011 4:28 am
Score: 18 Give a positive score

Re: How to handle text input

Postby skydereign » Tue May 29, 2012 8:00 pm

phyzix5761 wrote:What if I have two words that are the same length? Would that cause any problems?

Think about it. As I said, you aren't going to be able to create a perfect hash function, as there will be collisions in almost any string based hash you use. So, if it is just adding up the characters then if the strings have the same characters, there will be a collision. For instance the "words" ab and ba will be stored in the same location (causing a collision). Words can also collide due to the modulo and just summing the characters up (different length words can still collide). But, that is why you add some form of collision resolution. For instance you can use a linked list, a bucket, or a linear system. The list would allow for unlimited number of entries, while the others wouldn't.
User avatar
skydereign
 
Posts: 3510
Joined: Mon Jul 28, 2008 8:29 am
Score: 589 Give a positive score

Re: How to handle text input

Postby lcl » Wed Jun 06, 2012 7:23 pm

Hey skydereign, I got quite interested of that hash function system.
Could you explain how one is supposed to store the words in to the right indexes? Won't there be quite many indexes not used, because the numbers that the characters return are quite large?

Thank you in advance! :)
User avatar
lcl
 
Posts: 2339
Joined: Thu Mar 25, 2010 5:55 pm
Location: Finland
Score: 276 Give a positive score

Re: How to handle text input

Postby skydereign » Thu Jun 07, 2012 12:13 am

I actually had to write a hash adt for a group project in one of my classes. I can post that with some explanation and comments if you want. There's nothing in it that gE can't do, so I could upload a ged demonstrating it as well. If so, you can expect it tomorrow or the day after. But in the meantime, the sum I mentioned isn't the actual index, as it is shrunk by using modulus.
User avatar
skydereign
 
Posts: 3510
Joined: Mon Jul 28, 2008 8:29 am
Score: 589 Give a positive score

Re: How to handle text input

Postby lcl » Sun Jun 10, 2012 11:43 pm

I would be thankful if you sent me a ged demonstrating it :D
User avatar
lcl
 
Posts: 2339
Joined: Thu Mar 25, 2010 5:55 pm
Location: Finland
Score: 276 Give a positive score


Return to General

Who is online

Users browsing this forum: No registered users and 1 guest

cron