Which hash string function should I use in CryEngine?

Posted: Tue Jan 16, 2018 11:15 am
by HDN
I'm confused that there are several hash string functions:
file: ..\Code\CryEngine\CryCommon\CryString\StringUtils.h

Code: Select all

//! Calculates a hash value for a given string.
inline uint32 CalculateHash(const char* str)

//! This function is Unicode agnostic and locale agnostic.
inline uint32 HashString(const char* string)
....
My goal is just to convert strings to integers so that I can improve performance. But what is the difference between CalculateHash and HashString?
Also, the hash of a string is guaranteed to be the same on every machine under all circumstances, right?

Re: Which hash string function should I use in CryEngine?

Posted: Sun Jan 28, 2018 4:11 pm
by ZhenWang
Use the official CE docs online. The page is called CryString; here is the link:
http://docs.cryengine.com/display/SDKDOC4/CryString

Re: Which hash string function should I use in CryEngine?

Posted: Mon Jan 29, 2018 5:13 am
by HDN
Thanks for the good tip :) So I see that we should use the CONST_TEMP_STRING macro:

Code: Select all

const char *szKey= "Test";

map< string, int >::const_iterator iter = m_values.find( CONST_TEMP_STRING( szKey ) ); // Good way

map< string, int >::const_iterator iter = m_values.find( szKey ); // Bad way, don't do it like this!
But don't you think using a hashed string is better than using map< string, int > as in the example above?

Re: Which hash string function should I use in CryEngine?

Posted: Tue Jan 30, 2018 4:06 pm
by Cry-Flare
The code you posted, which comes from the CryString documentation, demonstrates how to get an iterator to a key/value pair from a map of <string, int> pairs. It does not turn your string into an int.

The hashing functions use known algorithms to process a string and produce a fixed-size output. In the examples you gave, the output is a 32-bit unsigned integer, so there are only 4294967296 possible hash values (0 to 4294967295). This means collisions can occur: two different strings can produce the same hash. How often that happens depends on the algorithm, and two short strings can collide as well; it is not only a matter of the number of possible strings exceeding the 32-bit range.

Even with a perfect algorithm, collisions are unavoidable among strings (a-z only) of 7 characters or more, since with 7 characters there are 8031810176 possible combinations, which is more than the 4294967296 available hash values. You may want to check a wiki describing hash collisions in more detail. Mostly it shouldn't be a problem, since most strings (I assume) will be readable English to some degree, and with a good hash function minor variations produce very different hashes.
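
The counting argument above can be checked with a few lines of standard C++ (C++14 or later). This is only an illustrative sketch; `countStrings` and `kHashValues32` are hypothetical names made up for the example and are not part of CryEngine.

```cpp
#include <cstdint>

// Number of distinct strings of exactly `length` characters drawn from an
// alphabet of `alphabetSize` symbols (alphabetSize raised to the power length).
constexpr std::uint64_t countStrings(std::uint64_t alphabetSize, unsigned length)
{
    std::uint64_t total = 1;
    for (unsigned i = 0; i < length; ++i)
    {
        total *= alphabetSize;
    }
    return total;
}

// Number of distinct values a 32-bit hash can take: 2^32 = 4294967296.
constexpr std::uint64_t kHashValues32 = std::uint64_t{1} << 32;

// All 6-letter a-z strings could still fit into 32 bits without colliding...
static_assert(countStrings(26, 6) < kHashValues32, "26^6 fits in 32 bits");
// ...but there are more 7-letter strings than hash values (pigeonhole principle).
static_assert(countStrings(26, 7) > kHashValues32, "26^7 exceeds 2^32");
```

Note that this is only the point where collisions become unavoidable. With a 32-bit hash, a set of roughly 77,000 random distinct strings already has about a 50% chance of containing at least one colliding pair (the birthday effect), so checking for duplicates among the strings you actually use is still worthwhile.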

Some hash functions will be faster than others, with more or fewer features (Unicode support, for example).

Re: Which hash string function should I use in CryEngine?

Posted: Wed Jan 31, 2018 6:51 am
by ivanhawkes
I've been using a CHashedString whenever I need something to represent a hashed string and so far it's worked everywhere I've needed it. Just construct one from a string or char* and off you go. Some older parts of my code which were basically lifted from GameSDK do have calls to CryStringUtils::HashString and use CryHash - though I expect to deprecate most of that over time.

Re: Which hash string function should I use in CryEngine?

Posted: Sat Feb 03, 2018 9:18 am
by HDN
Cry-Flare wrote:
The hashing functions use known algorithms to "parse" a string and produce a fixed size output. In the examples you gave the output would be a 32bit (unsigned) integer. This means that collisions can occur if your string is too long. Technically, depending on the algorithm used it doesn't necessarily collide when the possibilities of the string exceed the 32 bit integer limit (4294967295).

Even with a perfect algorithm, you are guaranteed to be at risk of collisions when you hash a string (a-z only) longer than 7 characters, since with 7 characters there is a possible 8031810176 different combinations. you may want to check a wiki describing hash collisions in more detail. Mostly, it shouldn't be a problem, since most strings (I assume) will be readable English to some degree and minor variations will produce very different hashes.

My code is like this:

Code: Select all

// Precomputed hashes of the UI event names (the source string is in the trailing comment).
// The 'u' suffix makes the literals unsigned so they match the uint32 hash type.
#define HASH_EVENT_RESUME_CLICKED  3698067075u //1316831589 //btnResume_Clicked
#define HASH_EVENT_EXIT_CLICKED    2073346316u //857729172 //btnExit_Clicked
#define HASH_EVENT_BTN_SOUND_HOVER 3336173205u //128353540 //btnSound_Hover
#define HASH_EVENT_BTN_SOUND_PRESS 1103445500u //3048322957 //btnSound_Press

// Hash the event name once, then dispatch on the integer instead of comparing strings.
uint32 eventName = CryStringUtils::CalculateHash(event.sDisplayName); //CryStringUtils::HashString(event.sDisplayName);

switch (eventName)
{
case HASH_EVENT_RESUME_CLICKED:
    break;
case HASH_EVENT_EXIT_CLICKED:
    gEnv->pSystem->Quit();
    break;
case HASH_EVENT_BTN_SOUND_HOVER:
    break;
case HASH_EVENT_BTN_SOUND_PRESS:
    break;
default:
    break;
}
So in theory I should notice any hash collision at the time I write the code (in the #define list). The only thing I'm afraid of is that on some strange console, the hash generated at run time won't match the value I defined at compile time.
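
One possible way to guard against that mismatch is a start-up self-check that rehashes each name and compares the result with the constant baked into the code. The sketch below is only illustrative and assumes C++14: `ExpectedHash`, `countMismatches`, `kExpected`, and `standInHash` are hypothetical names, and `standInHash` is a 32-bit FNV-1a stand-in, not the engine's algorithm. In a real project the call to `standInHash` would presumably be replaced with CryStringUtils::CalculateHash, and the table would hold the event names with their #define values.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in hash (32-bit FNV-1a). This is NOT CryEngine's function; it only
// makes the sketch self-contained. It uses fixed-width unsigned arithmetic on
// the raw bytes, so the result does not depend on int size, char signedness,
// or locale.
constexpr std::uint32_t standInHash(const char* s)
{
    std::uint32_t h = 2166136261u;           // FNV-1a offset basis
    for (; *s != '\0'; ++s)
    {
        h ^= static_cast<std::uint8_t>(*s);  // treat each char as an unsigned byte
        h *= 16777619u;                      // FNV prime; wraps modulo 2^32
    }
    return h;
}

// One row per hard-coded constant: the source string and the value in the code.
struct ExpectedHash
{
    const char*   name;
    std::uint32_t value;
};

// Returns how many entries hash to something other than their expected value.
constexpr std::size_t countMismatches(const ExpectedHash* table, std::size_t count)
{
    std::size_t bad = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (standInHash(table[i].name) != table[i].value)
        {
            ++bad;
        }
    }
    return bad;
}

// Example table using two well-known FNV-1a test vectors.
constexpr ExpectedHash kExpected[] = {
    { "",  0x811C9DC5u },
    { "a", 0xE40C292Cu },
};
```

Calling `countMismatches(kExpected, 2)` once during initialization and logging or asserting when it returns a nonzero value would flag a platform where the run-time hash disagrees with the hard-coded constants.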
Cry-Flare wrote:
Some has functions will be faster than others, with more or less features. (Unicode support for ex.).

But based on the comments of the two functions:
CalculateHash - Calculates a hash value for a given string.
HashString - This function is Unicode agnostic and locale agnostic.
I don't know which one is better. (It looks like HashString doesn't care about Unicode and locale, but what about CalculateHash?)

ivanhawkes wrote:
I've been using a CHashedString whenever I need something to represent a hashed string and so far it's worked everywhere I've needed it. Just construct one from a string or char* and off you go.

Thanks ivanhawkes, CHashedString seems interesting. I'll consider using it in more complicated cases.

ivanhawkes wrote:
Some older parts of my code which were basically lifted from GameSDK do have calls to CryStringUtils::HashString and use CryHash - though I expect to deprecate most of that over time.

Why do you expect that? Is the code of CryStringUtils::HashString bad?

Re: Which hash string function should I use in CryEngine?

Posted: Tue Feb 06, 2018 11:52 am
by ivanhawkes
ivanhawkes wrote:
Some older parts of my code which were basically lifted from GameSDK do have calls to CryStringUtils::HashString and use CryHash - though I expect to deprecate most of that over time.

HDN wrote:
Why do you expect that? Is the code of CryStringUtils::HashString bad?

The code is probably fine, but I prefer to unify my usage of code, and that means replacing code within my own project (or snipped from GameSDK) with new alternatives in the main code base when they become available. There really shouldn't be three different ways to calculate a string hash; it creates confusion and questions about compatibility, and it violates the DRY principle.

Re: Which hash string function should I use in CryEngine?

Posted: Thu Feb 08, 2018 4:44 pm
by sunnlok
For your use case I would recommend CalculateHash.
It's a standard CRC32 implementation and should produce the same results as other CRC32 hashes. I use it for a similar purpose as well.
If you are on CE 5.4, you can also use Crc32::Compute_CompileTime to generate hashes at compile time instead of predefining them.
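
To illustrate the compile-time idea without depending on the engine, here is a minimal sketch of a constexpr CRC-32 in plain C++14. It is not CryEngine's implementation: `crc32`, `UiEvent`, and `classify` are hypothetical names for this example, and whether the values match CalculateHash bit for bit is an assumption you would need to verify against your engine version.

```cpp
#include <cstdint>

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the variant used
// by zlib and PNG). Slow compared to a table-driven version, but usable in
// constant expressions.
constexpr std::uint32_t crc32(const char* s)
{
    std::uint32_t crc = 0xFFFFFFFFu;
    for (; *s != '\0'; ++s)
    {
        crc ^= static_cast<std::uint8_t>(*s);
        for (int bit = 0; bit < 8; ++bit)
        {
            // If the lowest bit is set, shift and XOR in the polynomial;
            // otherwise just shift.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Standard CRC-32 check value for the ASCII string "123456789".
static_assert(crc32("123456789") == 0xCBF43926u, "CRC-32 check value");

enum class UiEvent { ResumeClicked, ExitClicked, Unknown };

// The case labels are hashed by the compiler, so no hand-written constants
// are needed. If two names ever collided, the duplicate case labels would be
// rejected at compile time.
constexpr UiEvent classify(const char* name)
{
    switch (crc32(name))
    {
    case crc32("btnResume_Clicked"): return UiEvent::ResumeClicked;
    case crc32("btnExit_Clicked"):   return UiEvent::ExitClicked;
    default:                         return UiEvent::Unknown;
    }
}
```

With this pattern the same function produces both the case labels and the run-time value, so the two cannot drift apart the way hand-copied #define constants can.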

Re: Which hash string function should I use in CryEngine?

Posted: Fri Feb 09, 2018 1:43 am
by HDN
Thanks sunnlok, that information is really helpful :)