
Hashing

Ajay-Dhangar

Definition:

Hashing is a process that transforms input data (a message) of any length into a fixed-size value, usually displayed as a hexadecimal string. The output, known as the hash value, hash code, or digest, is generated by a hash function. Hashing is widely used in applications such as data integrity verification, password storage, and digital signatures.

Characteristics:

  • Deterministic:

    • The same input will always produce the same hash output. This property allows for consistent verification of data integrity.
  • Fixed Size:

    • Regardless of the size of the input data, the output hash will always be of a fixed length, making it easier to handle and compare.
  • Fast Computation:

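    • General-purpose hash functions are designed to compute hash values quickly, allowing for efficient data processing. Password-hashing functions such as bcrypt are a deliberate exception: they are made slow so that guessing passwords is expensive.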
  • Pre-image Resistance:

    • Given a hash output, it should be computationally infeasible to find an input that produces it. Hashing is one-way rather than a form of encryption, so a stored hash does not directly reveal the original data.
  • Collision Resistance:

    • It should be computationally infeasible to find two different inputs that produce the same hash output, so that an attacker cannot swap in altered data that still matches the original hash.
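
The first two characteristics can be observed with very little code. The sketch below uses FNV-1a (64-bit), a simple non-cryptographic hash chosen here only because it fits in a few lines; it is an assumption of this example and offers no pre-image or collision resistance. The helper names `fnv1a64` and `to_hex` are hypothetical.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// FNV-1a, 64-bit: a simple NON-cryptographic hash, used here only to
// illustrate determinism and fixed output size.
std::uint64_t fnv1a64(const std::string &data) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;               // mix in one input byte
        hash *= 0x100000001b3ULL;   // multiply by the FNV prime (wraps mod 2^64)
    }
    return hash;
}

// Render the 64-bit value as exactly 16 lowercase hex characters.
std::string to_hex(std::uint64_t value) {
    std::ostringstream oss;
    oss << std::hex << std::setw(16) << std::setfill('0') << value;
    return oss.str();
}
```

Calling `fnv1a64("hello")` twice returns the same value (deterministic), and `to_hex` always yields 16 characters whether the input is empty or thousands of bytes long (fixed size).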

Common Hashing Algorithms:

  1. MD5 (Message Digest 5):

    • Produces a 128-bit hash value and is widely used for checksums and data integrity. However, it is no longer considered secure against collision attacks.
  2. SHA-1 (Secure Hash Algorithm 1):

    • Produces a 160-bit hash value. Like MD5, SHA-1 is broken with respect to collisions (practical collision attacks have been demonstrated) and is not recommended for security-sensitive applications.
  3. SHA-256:

    • Part of the SHA-2 family, it produces a 256-bit hash value and is widely used in security applications and protocols, including SSL/TLS and Bitcoin.
  4. bcrypt:

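    • A hashing function designed specifically for passwords. It incorporates a salt to protect against rainbow table attacks and an adjustable cost factor that makes each hash deliberately slow to compute, which limits brute-force guessing.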
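
The bit lengths quoted above map directly onto what a program stores and prints: 8 bits per raw byte and 4 bits per hexadecimal character. A small sketch of that arithmetic follows; the function names `digest_bytes` and `digest_hex_chars` are hypothetical helpers, not part of any library.

```cpp
#include <cstddef>

// Number of raw bytes in a digest of the given bit length (8 bits per byte).
constexpr std::size_t digest_bytes(std::size_t bits) { return bits / 8; }

// Number of hex characters needed to print that digest (2 per byte).
constexpr std::size_t digest_hex_chars(std::size_t bits) {
    return digest_bytes(bits) * 2;
}

// Checked at compile time for the fixed-length digests listed above.
static_assert(digest_hex_chars(128) == 32, "MD5 prints as 32 hex characters");
static_assert(digest_hex_chars(160) == 40, "SHA-1 prints as 40 hex characters");
static_assert(digest_hex_chars(256) == 64, "SHA-256 prints as 64 hex characters");
```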

Time Complexity:

  • Hash Computation Time: O(n), where n is the length of the input
    The time complexity for computing a hash value depends on the input size, with most hashing algorithms running in linear time relative to the input length.

Space Complexity:

  • Space Complexity: O(1)
    The output size of a hash function is constant (fixed size), independent of the input size, leading to constant space complexity for storing hash values.
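
Both bounds can be seen in an incremental hasher: every input byte is visited exactly once (linear time), while the only state kept between chunks is a fixed-size value (constant space). The sketch below again uses FNV-1a (64-bit) as a non-cryptographic stand-in, which is an assumption made for brevity; the class name `StreamingFnv1a` is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Incremental FNV-1a (64-bit). NON-cryptographic; shown only to illustrate
// one-pass processing with constant-size state.
class StreamingFnv1a {
public:
    // Feed the next chunk of input; cost is proportional to chunk.size().
    void update(const std::string &chunk) {
        for (unsigned char byte : chunk) {
            state_ ^= byte;
            state_ *= 0x100000001b3ULL;  // FNV prime (wraps mod 2^64)
        }
    }

    // Current hash of everything fed so far.
    std::uint64_t digest() const { return state_; }

private:
    std::uint64_t state_ = 0xcbf29ce484222325ULL;  // FNV offset basis
};
```

Feeding "Hello, " and then "World!" gives the same digest as feeding "Hello, World!" in one call, so large inputs can be hashed chunk by chunk without ever holding them fully in memory.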

C++ Implementation of Hashing (Using SHA-256):

// Requires OpenSSL's libcrypto; build with e.g.: g++ main.cpp -lcrypto
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <openssl/sha.h>

// Returns the SHA-256 digest of `data` as a 64-character lowercase hex string.
std::string sha256(const std::string &data) {
    unsigned char hash[SHA256_DIGEST_LENGTH];  // 32 raw bytes
    SHA256(reinterpret_cast<const unsigned char *>(data.data()), data.size(), hash);

    // Print each byte as two hex digits, zero-padded.
    std::ostringstream oss;
    oss << std::hex << std::setfill('0');
    for (const unsigned char byte : hash) {
        oss << std::setw(2) << static_cast<int>(byte);
    }
    return oss.str();
}

int main() {
    const std::string data = "Hello, World!";
    const std::string hashValue = sha256(data);

    std::cout << "Data: " << data << '\n';
    std::cout << "SHA-256 Hash: " << hashValue << '\n';

    return 0;
}
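
Once a hex digest has been computed, integrity verification comes down to comparing it with a trusted reference digest. The following is a minimal sketch of that comparison, assuming both digests are hex strings that may differ in letter case; `digests_match` is a hypothetical helper, not an OpenSSL function.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true when two hex digests are equal, ignoring letter case
// (tools print digests in either upper or lower case).
bool digests_match(const std::string &computed_hex, const std::string &expected_hex) {
    if (computed_hex.size() != expected_hex.size()) {
        return false;  // different lengths can never be the same digest
    }
    for (std::size_t i = 0; i < computed_hex.size(); ++i) {
        const int a = std::tolower(static_cast<unsigned char>(computed_hex[i]));
        const int b = std::tolower(static_cast<unsigned char>(expected_hex[i]));
        if (a != b) {
            return false;
        }
    }
    return true;
}
```

A caller would hash the received data, then pass the result together with the published digest to `digests_match`; any mismatch means the data differs from what was originally hashed.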

Summary:

Hashing is a fundamental technique in computer science and cryptography, used for data integrity verification, password management, and digital signatures. Secure hash functions such as SHA-256 are designed so that finding pre-images or collisions is computationally infeasible, while password storage calls for a dedicated, salted function such as bcrypt. Understanding hashing and its properties is essential for implementing secure systems and applications.
