Which hash string function should I use in CryEngine?

#1
I'm confused because there are several string hash functions:
file: ..\Code\CryEngine\CryCommon\CryString\StringUtils.h

Code:

//! Calculates a hash value for a given string.
inline uint32 CalculateHash(const char* str)

//! This function is Unicode agnostic and locale agnostic.
inline uint32 HashString(const char* string)
....
My goal is simply to convert strings to ints so that I can improve performance. But what is the difference between CalculateHash and HashString?
And the hash of a given string is guaranteed to be the same on every machine under all circumstances, right?
Small tips
How to add an image to a forum post
[C++] How to smoothly turn your character

Re: Which hash string function should I use in CryEngine?

#3
Thanks for the good tips :) So I see that we should use the CONST_TEMP_STRING macro:

Code:

const char* szKey = "Test";

// Good way
map<string, int>::const_iterator iterGood = m_values.find(CONST_TEMP_STRING(szKey));

// Bad way, don't do it like this!
map<string, int>::const_iterator iterBad = m_values.find(szKey);
But don't you think using a hashed string is better than using map<string, int> as in the example above?

Re: Which hash string function should I use in CryEngine?

#4
The code you posted, which comes from the CryString documentation, demonstrates how to get an iterator to a key/value pair in a map of <string, int> pairs. It does not turn your string into an int.
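
A minimal sketch of what such a lookup returns, using std::map and std::string as stand-ins for the engine's container and string types. LookupValue is a hypothetical helper name for illustration, not something from CryEngine:

```cpp
#include <map>
#include <string>

// Returns the int stored under `key`, or `fallback` when the key is absent.
// The iterator from find() points at an existing key/value pair; the int is
// the value that was stored earlier, not a number computed from the string.
inline int LookupValue(const std::map<std::string, int>& values,
                       const std::string& key,
                       int fallback)
{
    const std::map<std::string, int>::const_iterator iter = values.find(key);
    return iter != values.end() ? iter->second : fallback;
}
```

The map still compares whole strings during the search, which is a different operation from hashing a string down to a single integer.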

The hashing functions use known algorithms to reduce a string to a fixed-size output. In the examples you gave, the output is a 32-bit unsigned integer. Because there are far more possible strings than 32-bit values, two different strings can end up with the same hash (a collision). A collision is only guaranteed to exist once the number of possible input strings exceeds the number of 32-bit values (4294967296); below that, whether two particular strings collide depends on the algorithm.

Even with a perfect algorithm, collisions must exist once you hash strings (a-z only) of 7 or more characters, since 7 characters already allow 26^7 = 8031810176 different combinations, which is more than a 32-bit integer can represent. You may want to check a wiki describing hash collisions in more detail. Mostly, it shouldn't be a problem, since most strings (I assume) will be readable English to some degree, and with a good hash function minor variations produce very different hashes.
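
The counting argument can be checked with a few lines of standalone C++. This is only a sketch: CountStrings and CollisionProbability are hypothetical names, and the probability estimate uses the usual birthday approximation, which assumes the hash spreads its outputs uniformly. That assumption has not been verified here for the engine's functions.

```cpp
#include <cmath>
#include <cstdint>

// Number of distinct 32-bit hash values: 2^32 = 4294967296.
constexpr std::uint64_t kHashValues32 = std::uint64_t(1) << 32;

// Number of distinct strings of exactly `length` characters drawn from an
// alphabet of `alphabetSize` symbols (alphabetSize^length).
constexpr std::uint64_t CountStrings(std::uint64_t alphabetSize, unsigned length)
{
    return length == 0 ? 1 : alphabetSize * CountStrings(alphabetSize, length - 1);
}

// Six lowercase letters could still all receive distinct 32-bit hashes...
static_assert(CountStrings(26, 6) == 308915776ULL, "26^6");
static_assert(CountStrings(26, 6) < kHashValues32, "fits in 32 bits");
// ...but seven cannot, so at least two such strings must share a hash.
static_assert(CountStrings(26, 7) == 8031810176ULL, "26^7");
static_assert(CountStrings(26, 7) > kHashValues32, "exceeds 32 bits");

// Birthday approximation: chance that at least two of `count` distinct
// strings collide, ASSUMING uniformly distributed 32-bit hashes.
inline double CollisionProbability(std::uint64_t count)
{
    const double n = static_cast<double>(count);
    const double buckets = static_cast<double>(kHashValues32);
    return 1.0 - std::exp(-n * (n - 1.0) / (2.0 * buckets));
}
```

Under that assumption a handful of names is nowhere near the danger zone: CollisionProbability(4) is on the order of one in a billion, and the estimate only reaches about 50% at roughly 77,000 distinct strings.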

Some hash functions will be faster than others, with more or fewer features (Unicode support, for example).
Uniflare
CRYENGINE Community Coordinator

Re: Which hash string function should I use in CryEngine?

#6
My code is like this:

Code:

#define HASH_EVENT_RESUME_CLICKED  3698067075u // 1316831589 // btnResume_Clicked
#define HASH_EVENT_EXIT_CLICKED    2073346316u // 857729172 // btnExit_Clicked
#define HASH_EVENT_BTN_SOUND_HOVER 3336173205u // 128353540 // btnSound_Hover
#define HASH_EVENT_BTN_SOUND_PRESS 1103445500u // 3048322957 // btnSound_Press

uint32 eventName = CryStringUtils::CalculateHash(event.sDisplayName); // CryStringUtils::HashString(event.sDisplayName);

switch (eventName)
{
case HASH_EVENT_RESUME_CLICKED:
    break;
case HASH_EVENT_EXIT_CLICKED:
    gEnv->pSystem->Quit();
    break;
case HASH_EVENT_BTN_SOUND_HOVER:
    break;
case HASH_EVENT_BTN_SOUND_PRESS:
    break;
default:
    break;
}
So in theory, I should notice any hash collisions when I write the code (the #define lines). The only thing I'm afraid of is that on some unusual console, the hash generated at run time won't match the value I defined at compile time.
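
One way to take hand-pasted constants out of the picture is to let the compiler evaluate, for the case labels, the same hash function that runs on the event name at run time. The sketch below is only an illustration under stated assumptions: Fnv1a32 is a 32-bit FNV-1a hash used as a stand-in, not the algorithm behind CryStringUtils::CalculateHash or HashString, and UiAction and ClassifyEvent are hypothetical names. Whether the engine's own functions can be evaluated at compile time is not something this sketch establishes.

```cpp
#include <cstdint>

// Stand-in hash (32-bit FNV-1a), NOT the engine's algorithm. It reads raw
// bytes and uses fixed-width unsigned arithmetic, so on mainstream platforms
// the result does not depend on locale or on whether `char` is signed.
constexpr std::uint32_t Fnv1a32(const char* str, std::uint32_t hash = 2166136261u)
{
    return *str == '\0'
        ? hash
        : Fnv1a32(str + 1, (hash ^ static_cast<std::uint8_t>(*str)) * 16777619u);
}

// Known FNV-1a test vectors, checked at compile time.
static_assert(Fnv1a32("") == 0x811c9dc5u, "offset basis");
static_assert(Fnv1a32("a") == 0xe40c292cu, "FNV-1a of \"a\"");

// Hypothetical action ids for the menu events in the post above.
enum class UiAction { Resume, Exit, SoundHover, SoundPress, Unknown };

// The case labels are hashed by the compiler and the event name is hashed by
// the same function at run time, so the two cannot drift apart. If two labels
// ever hashed to the same value, the duplicate case would be a compile error.
inline UiAction ClassifyEvent(const char* eventName)
{
    switch (Fnv1a32(eventName))
    {
    case Fnv1a32("btnResume_Clicked"): return UiAction::Resume;
    case Fnv1a32("btnExit_Clicked"):   return UiAction::Exit;
    case Fnv1a32("btnSound_Hover"):    return UiAction::SoundHover;
    case Fnv1a32("btnSound_Press"):    return UiAction::SoundPress;
    default:                           return UiAction::Unknown;
    }
}
```

A run-time name that is not in the list can still collide with one of the labels, so where a false match would matter, compare the actual string once the hash has matched.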
"Some hash functions will be faster than others, with more or fewer features (Unicode support, for example)."
But based on the comments on the two functions:
CalculateHash - Calculates a hash value for a given string.
HashString - This function is Unicode agnostic and locale agnostic.
I can't tell which one is better. It looks like HashString doesn't care about Unicode or locale, but what about CalculateHash?
"I've been using a CHashedString whenever I need something to represent a hashed string and so far it's worked everywhere I've needed it. Just construct one from a string or char* and off you go."
Thanks, ivanhawkes. CHashedString seems interesting; I'll consider using it in more complicated cases.
"Some older parts of my code which were basically lifted from GameSDK do have calls to CryStringUtils::HashString and use CryHash - though I expect to deprecate most of that over time."
Why do you expect that? Is the code of CryStringUtils::HashString bad?

Re: Which hash string function should I use in CryEngine?

#7
The code is probably fine, but I prefer to unify my usage of code, and that means replacing code within my own project (or snipped from GameSDK) with new alternatives in the main code base when they become available. There really shouldn't be three different ways to calculate a string hash: it creates confusion and questions about compatibility, and it violates the DRY principle.
