
Sunday, May 27, 2012

How to store user passwords using variable length random salt bytes with secure SHA512 cryptographic hashing functions

In modern web applications, users create accounts to gain access to customized data. When you build a signup/login process that requires a username and password, the credentials need to be stored in a way that keeps unauthorized users out. Furthermore, the passwords themselves need to be protected in case a hacker gets access to your data. In this post, we will review three schemes commonly used today for storing and verifying passwords.

Plain text scheme

The most basic (and unfortunately common) way to store user passwords is plain text. In this scheme, the login ID and the passwords are stored as plain text pairs in the database. When the user inputs his credentials, they are compared to the stored values and an appropriate response is returned. This is most commonly seen in legacy sites that have not yet upgraded their password schemes.

Username Password
Joe MyPassword
Nikhil IThoughtThisWasASecurePasswordButIGuessIWasWrong
John PasswordInPlainText

When a site is able to send you your actual password, for example when you ask to retrieve a forgotten one, it is using this completely insecure method of saving your password. Anybody who has access to the database (or the underlying storage) can see your password.

One way Hashed passwords scheme

The next level of password protection is seen when developers first start learning about cryptography. They instantly gravitate towards algorithms that generate a one-way hash of the password and store that hash in the database. The most common (but wrong) choices are MD5 and SHA-1. Both have known collision weaknesses and should not be used in new code. In the .NET world, Microsoft's SDL (Security Development Lifecycle) has banned the use of these two algorithms. The recommended algorithm at this writing is SHA512.

Username Password
Joe CpYLOeOyIIGvruPyewcE49fOI9
Nikhil roKMJYrGEKrp8z5Mah7J2T0cHcMAkZmxT6hplA3i1zTjidv0h
John 0oOW0dB36QHQaovguCL7fyjYYR

The disadvantage of such a scheme is that a dictionary or brute force attack can compromise your entire user base: the same password always produces the same hash, so one table of precomputed hashes can be checked against every account at once. Still, this scheme is better than storing passwords in plain text, and it protects your users’ confidential information from prying employees who have access to the production database.
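
To make the weakness concrete, here is a minimal sketch, assuming nothing beyond the standard .NET crypto classes. The class and method names (UnsaltedHashDemo, HashUnsalted) are hypothetical and exist only for this illustration; they are not part of the code that follows in this post.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical demo class, for illustration only.
public static class UnsaltedHashDemo
{
    // Hashes the input with a single pass of unsalted SHA-512
    // and returns the result as a lowercase hexadecimal string.
    public static string HashUnsalted(string input)
    {
        using (SHA512 sha = SHA512.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(input));

            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }
}
```

Calling UnsaltedHashDemo.HashUnsalted("MyPassword") for two different users returns the same 128-character string both times, so an attacker who cracks one row has cracked every row that shares that password.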

Let’s look at C# code to see how we can build this scheme. The first function we will look at is the hashing function itself. Then we will review how to use this function to generate the password hash and how to verify it.

using System.Security.Cryptography;
using System.Text;

/// <summary>
/// This function takes an input string and generates a 
/// one way hash using the SHA512CryptoServiceProvider
/// algorithm
/// </summary>
/// <param name="input">the input string</param>
/// <returns>a byte array of the hashed value</returns>
internal static byte[] GetHashInternal(string input)
{
    using (HashAlgorithm ha = new SHA512CryptoServiceProvider())
    {
        // Convert the input string to a byte array and compute the hash.
        byte[] data = ha.ComputeHash(Encoding.UTF8.GetBytes(input));
 
        return data;
    }
}

As you can see above, the framework provides a very easy way to generate the hashed bytes of an input string. Now let’s look at the wrapper functions to generate the password hash and verify it.



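/// <summary>
/// Takes an input string and generates a hashed string
/// </summary>
/// <param name="input">the plain text password</param>
/// <returns>the hash as a hexadecimal string</returns>
public string GetPassword(string input)
{
    // generate hashed bytes
    byte[] data = GetHashInternal(input);
 
    // convert to string and return
    return ByteArrayToString(data);
}
 
/// <summary>
/// Hashes the input and compares it to the stored hash
/// </summary>
/// <param name="input">the plain text password to check</param>
/// <param name="encPassword">the stored hexadecimal hash</param>
/// <returns>true if the hashes match</returns>
public bool VerifyPassword(string input, string encPassword)
{
    // regenerate the hashed bytes
    byte[] data = GetHashInternal(input);
 
    // convert to string
    string pass = ByteArrayToString(data);
 
    // compare character by character (ordinal, case sensitive)
    return String.Equals(pass, encPassword, StringComparison.Ordinal);
}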
 
/// <summary>
/// Generates a string using Hexadecimal for a given byte array
/// </summary>
/// <param name="data"></param>
/// <returns></returns>
private static string ByteArrayToString(byte[] data)
{
    // Create a new Stringbuilder to collect the bytes
    // and create a string.
    StringBuilder sBuilder = new StringBuilder();
 
    // Loop through each byte of the hashed data 
    // and format each one as a hexadecimal string.
    for (int i = 0; i < data.Length; i++)
    {
        sBuilder.Append(data[i].ToString("x2"));
    }
 
    // Return the hexadecimal string.
    return sBuilder.ToString();
}

The wrapper functions themselves are simple and should be self explanatory.


Random and variable length salted hashed password scheme


A salt is a set of random bytes that is combined with the password before hashing to make attacks on the stored hashes more difficult. A dictionary attack is an attack in which the attacker compares the stored hashes with previously computed hashes of the most likely passwords. This attack is made much more difficult by adding salt, or random bytes, at the beginning or end of the password before it is hashed, because the precomputed values no longer match and every account has to be attacked separately.
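
The effect of a salt can be sketched in a few lines. This is an illustration only, assuming a single SHA-512 pass over salt followed by password; the names SaltedHashDemo and HashWithSalt are hypothetical, and the caller supplies the salt.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical demo class, for illustration only.
public static class SaltedHashDemo
{
    // Computes one SHA-512 pass over (salt bytes + UTF-8 password bytes).
    public static byte[] HashWithSalt(string password, byte[] salt)
    {
        byte[] passwordBytes = Encoding.UTF8.GetBytes(password);

        // Place the salt in front of the password bytes.
        byte[] buffer = new byte[salt.Length + passwordBytes.Length];
        Buffer.BlockCopy(salt, 0, buffer, 0, salt.Length);
        Buffer.BlockCopy(passwordBytes, 0, buffer, salt.Length, passwordBytes.Length);

        using (SHA512 sha = SHA512.Create())
        {
            return sha.ComputeHash(buffer);
        }
    }
}
```

Hashing the same password with two different salts gives two unrelated 64-byte results, while hashing it again with the same salt reproduces the same result. That second property is what makes verification possible later.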












Username Password
Joe CpYLOeOyIIGvruPyewcE49fOI9/s3WkFDM2KT+UYPyaNrIFqAu/roKMJYrGEKrp8z5Mah7J2T0cHcMAkZmxT6hplA3i1zTjidv0h
Nikhil CTtLOq4Zx8oGGbg+3/0oOW0dB36QHQaovguCL7fyjYYR+d4wYhLCu/rRzZ5a2ENgy320+bWE8eLIHhxk1yGvNqKnGohIx0ubhJM=

This password scheme employs the following steps to secure the user data:


1. Generates a truly random number between 8 and 24. This will be used to determine the length of the random salt.


private static int GetTrueRandomNumber()
{
    // The default Random constructor seeds itself from the current time,
    // so two instances created close together can produce the same
    // "random" number. Instead, we seed the randomizer from a
    // cryptographic random number generator.
 
    // Use a 4-byte array to fill it with random bytes and convert it then
    // to an integer value.
    byte[] randomBytes = new byte[4];
 
    // Generate 4 random bytes.
    using (RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider())
    {
        rng.GetBytes(randomBytes);
    }
 
    // Convert 4 bytes into a non-negative 32-bit integer value
    // (the 0x7f mask clears the sign bit).
    int seed = (randomBytes[0] & 0x7f) << 24 |
                randomBytes[1] << 16 |
                randomBytes[2] << 8 |
                randomBytes[3];
 
    // Seed the randomizer with the crypto-random value.
    Random random = new Random(seed);
 
    // Return a random number between 8 and 24. The upper bound of
    // Random.Next is exclusive, so pass 25 to include 24.
    return random.Next(8, 25);
}
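
The function above only uses the crypto generator to seed System.Random. As a hedged alternative sketch, the number can be drawn from the crypto generator directly, discarding byte values that would make some results slightly more likely than others. The names RandomRange and NextInclusive are hypothetical, and the one-byte limit on the range is an assumption made to keep the example short.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class RandomRange
{
    // Returns a random integer in [min, max], both ends inclusive.
    // Assumes the range covers at most 256 values so one byte is enough.
    public static int NextInclusive(int min, int max)
    {
        long span = (long)max - min + 1;
        if (span < 1 || span > 256)
        {
            throw new ArgumentOutOfRangeException("max", "Range must cover 1 to 256 values.");
        }
        int range = (int)span;

        // Largest multiple of range that fits in 0..256. Bytes at or above
        // this limit are rejected so every result is equally likely.
        int limit = 256 - (256 % range);

        byte[] one = new byte[1];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            do
            {
                rng.GetBytes(one);
            } while (one[0] >= limit);
        }
        return min + (one[0] % range);
    }
}
```

For the salt length in this scheme the call would be RandomRange.NextInclusive(8, 24), which can return any of the 17 values from 8 through 24.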

2. Using this variable length, it then generates a random salt.


private static byte[] GetSaltBytes(string input, int saltSize)
{
    byte[] saltBytes = null;
    // This constructor generates a random salt of saltSize bytes
    // (saltSize must be at least 8). We only need that salt here,
    // not the derived key.
    using (Rfc2898DeriveBytes rdb = new Rfc2898DeriveBytes(input, saltSize, 1000))
    {
        // get salt
        saltBytes = rdb.Salt;
    }
    return saltBytes;
}
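
Since only the random salt is used from Rfc2898DeriveBytes, the same bytes could also be requested straight from the crypto random number generator. The sketch below shows that alternative; SaltGenerator and GetRandomSalt are hypothetical names, not part of the code in this post.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
public static class SaltGenerator
{
    // Returns saltSize bytes taken directly from the
    // cryptographic random number generator.
    public static byte[] GetRandomSalt(int saltSize)
    {
        if (saltSize <= 0)
        {
            throw new ArgumentOutOfRangeException("saltSize");
        }

        byte[] salt = new byte[saltSize];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return salt;
    }
}
```

This version does not need the user input at all, which makes it clearer that the salt is independent of the password.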

3. Next, the user input is hashed together with these variable length random salt bytes using the SHA512 algorithm. In addition, the algorithm performs 1000 (or more) passes over the hashed output to provide a higher level of security.


private static string Encrypt(string input, byte[] saltBytes)
{
    // get input bytes
    byte[] inputBytes = Encoding.UTF8.GetBytes(input);
 
    // create salt + input array
    byte[] saltAndInput = new byte[saltBytes.Length + inputBytes.Length];
    Buffer.BlockCopy(saltBytes, 0, saltAndInput, 0, saltBytes.Length);
    Buffer.BlockCopy(inputBytes, 0, saltAndInput, saltBytes.Length, inputBytes.Length);
 
    // hash the salt and input
    byte[] data = GetHashInternal(saltAndInput);
 
    // prepend the salt length (1 byte) and the raw salt to the hashed data
    byte ss = Convert.ToByte(saltBytes.Length);
 
    byte[] finalBytes = new byte[1 + saltBytes.Length + data.Length];
    finalBytes[0] = ss;
    Buffer.BlockCopy(saltBytes, 0, finalBytes, 1, saltBytes.Length);
    Buffer.BlockCopy(data, 0, finalBytes, saltBytes.Length + 1, data.Length);
 
    // convert to base64 string
    string hash = ByteArrayToString(finalBytes);
    return hash;
}


internal static byte[] GetHashInternal(byte[] input)
{
    using (HashAlgorithm ha = new SHA512CryptoServiceProvider())
    {
        // copy the input into a working byte array
        byte[] data = new byte[input.Length];
        Array.Copy(input, data, input.Length);
 
        // process this at least 1000 times
        for (int i = 0; i < 1000; i++)
        {
            // Hash the output of the previous pass.
            data = ha.ComputeHash(data);
        }
        return data;
    }
}

As you can see above, the Encrypt function performs a few steps. It first converts the user input string to bytes. It then inserts the un-encrypted salt bytes at the beginning of the user bytes. Next, using the SHA512 algorithm, it runs 1000 passes over the salt+input byte array to generate a strong hash. Next, it builds the final byte array: the first byte holds the length of the salt, followed by the variable length random un-encrypted salt, followed by the hashed salt+input bytes. Lastly, it converts this to a Base64 encoded string and returns it to the caller.



/// <summary>
/// converts a byte array to a base 64 encoded string
/// </summary>
/// <param name="data"></param>
/// <returns></returns>
private static string ByteArrayToString(byte[] data)
{
    string output = Convert.ToBase64String(data);
    return output;
}
 
/// <summary>
/// converts a base 64 encoded string to a byte array
/// </summary>
/// <param name="data"></param>
/// <returns></returns>
private static byte[] StringToByteArray(string data)
{
    byte[] output = Convert.FromBase64String(data);
    return output;
}

Now that we have the core algorithm under our belt, let’s look at the final routines on how to generate the password and verify it.


4. Generate the password and verify functions.


public string GetPassword(string input)
{
    // get random salt size
    int saltSize = GetTrueRandomNumber();
 
    // get random salt bytes
    byte[] saltBytes = GetSaltBytes(input, saltSize);
 
    // using this totally random and variable length salt
    // generate a crypto hash of the input string
    string hash = Encrypt(input, saltBytes);
 
    return hash;
}
 
public bool VerifyPassword(string input, string encPassword)
{
    // convert stored password to bytes
    byte[] finalBytes = StringToByteArray(encPassword);
 
    // reject values too short to hold the salt length and the salt
    if (finalBytes.Length == 0 || finalBytes.Length < 1 + finalBytes[0])
    {
        return false;
    }
 
    // get salt size (stored in the first byte)
    int saltSize = Convert.ToInt32(finalBytes[0]);
 
    // now get raw salt 
    byte[] saltBytes = new byte[saltSize];
    Array.Copy(finalBytes, 1, saltBytes, 0, saltSize);
 
    // using this recovered salt
    // generate a crypto hash of the input string
    string hash = Encrypt(input, saltBytes);
 
    // check for match (ordinal, case sensitive)
    bool match = String.Equals(encPassword, hash, StringComparison.Ordinal);
 
    return match;
}

The GetPassword() function is self-explanatory. You first generate a random number, get the variable length random salt bytes, use this to generate the password bytes, convert to Base64 encoded string and return this value.


VerifyPassword() uses slightly different logic. Since each stored password uses its own variable length random salt, the function first needs to read the salt length and the salt bytes back out of the stored value. Then, using these salt bytes, it re-runs the hashing algorithm on the user input and compares the result with the stored value. Remember, this is still using the one-way hashing concept, as that is the strongest way to ensure password safety.
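
The extraction step can be sketched on its own. The helper below is a hypothetical illustration (the names StoredPasswordParser and TrySplit are not part of the code above); it assumes the layout described earlier: one length byte, then the salt, then the hash, all Base64 encoded.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class StoredPasswordParser
{
    // Splits a stored value laid out as [salt length][salt][hash]
    // into its salt and hash parts. Returns false if the value is
    // not valid Base64 or is too short for the length it declares.
    public static bool TrySplit(string stored, out byte[] salt, out byte[] hash)
    {
        salt = null;
        hash = null;
        if (stored == null)
        {
            return false;
        }

        byte[] all;
        try
        {
            all = Convert.FromBase64String(stored);
        }
        catch (FormatException)
        {
            return false;
        }

        if (all.Length < 1)
        {
            return false;
        }

        int saltSize = all[0];
        int hashSize = all.Length - 1 - saltSize;
        if (hashSize <= 0)
        {
            return false;
        }

        salt = new byte[saltSize];
        hash = new byte[hashSize];
        Buffer.BlockCopy(all, 1, salt, 0, saltSize);
        Buffer.BlockCopy(all, 1 + saltSize, hash, 0, hashSize);
        return true;
    }
}
```

Validating the length before copying means a truncated or corrupted database value results in a failed login rather than an exception.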


In a production environment, you can do some additional variations where the un-encrypted variable length random salt bytes are inserted at the end, or a random location or in a separate encrypted string in a different column.


I hope this article helps you secure your customers’ private data from the prying eyes of hackers. I would love to hear your feedback.

10 comments:

  1. This comment has been removed by the author.

  2. Nikhil,

    1) MD5/SHA1 are broken for digital signatures because we've found ways to make one input hash the same as another input. This doesn't affect passwords, where their use is still perfectly fine (the goal here is to be one-way, or non-recoverable). Their only weakness compared to SHA-512 is that SHA-512 is a stronger hash (which is a fine enough reason to not use them).

    2) What do you believe the advantage of a variable-length salt is over a fixed-length one? To me this results in extending the effective salt by a handful of extra bits (4, in this case), in a more complex way than just adding more bytes to a fixed-length one.

    3) Why are you using the PBKDF2 class to generate a salt, and not RNGCryptoServiceProvider?

    4) Of a somewhat interesting note, your GetHashInternal is essentially a weaker version of the PBKDF2 algorithm you invoke in GetSaltBytes. It's designed to burn CPU to make brute-forcing take a long time, but does not actually make the hash any stronger. Why don't you just use PBKDF2?

    (edited! i misread your code the first time around.)

    Replies
    1. Hi Cory,
      Thanks for taking the time to read the code and respond.

      1. MD5 and SHA1 are banned by Microsoft SDL team. This article http://msdn.microsoft.com/en-us/magazine/ee321570.aspx is a great read. Check out the first figure where they talk about Acceptable/Recommended algorithms for Hash.

      2. A variable length salt gives extra protection against birthday attacks. And as you said, it is just a few extra bytes but makes the task of a hacker extremely difficult.

      3. Good point. Either one could be used. The actual salt bytes are not that important. Neither is the generation scheme.

      4. You are right about my HashInternal function being similar to Rfc2898DeriveBytes. On the performance tests I ran, i found that my stripped down version performed marginally better than the Rfc2898DeriveBytes.GetBytes() function. In addition, Rfc2898DeriveBytes uses HMACSHA1 which is known to be weak. Hence I went the route I outlined above.

  3. 1. I understand Microsoft banning it outright as a policy just to keep things simpler, but your article implies they're unsafe for password hashes, which is just not correct. The MSDN article explicitly mentions the weakness is for use with digital signatures, not one-way hashing.

    2. Interesting, I've not before heard of variable-length being used to solve this problem. I'll have to do some reading!

    3. Agreed, I was just curious!

    4. The whole idea of such a loop is to burn CPU, why does perf matter? As I said above, the weakness is only for digital signatures and it's not relevant for one-way hashing or key generation.

  4. can you help me I tried to use your method to encrypt login and register page of my site I'm new in encryption so I tried to use salt hash for make register page save user and password in sql database in hash then when user login will retrieve it from sql data base.. If you have a code project can I download or help me in any other way I'm very thankful for your help,

  5. Nikhil,
    Thanks for your post. Not understanding all of the code, is this possible: Decrypting the encrypted password to return as a string (the actual password)?

    Replies
    1. I've answered my own question: it can't be done - which is the point of this type of hash.
      Thanks.

  6. Hi, nice article, but I have some questions. Like what about SQL, can I store these passwords there and pull them back to verify user input, is that safe?
    Is there any way to make passwords crypted in sql, hash them there and just verify in front end (web app in C#) if they were the same?

  7. When I use salt with the password and then hash, wouldn't I need the salt stored somewhere so when a user enters their password and I need to see if the hashes match wouldn't I have to use the same salt with the password? Do you store the salt used in the user's account? Am I misunderstanding you?

  8. It seems to be that using PBKDF2 is significantly simpler than using SHA512 as it gives you key stretching out of the box. Something like this: http://manyrootsofallevilrants.blogspot.co.uk/2012/12/slow-vs-fast-hashing-algorithms-in-c.html
