Hello everyone!
This is the second installment in my tnfy.link series, a deep dive into yet another URL shortener! This post focuses on short link generation. The task looks simple, but choosing the right method involves a few trade-offs.
Essentially, generating a short link involves creating a concise, unique identifier for each long URL. This ID must satisfy several criteria:
After thorough investigation, I've identified four primary methods for short link creation. Let's examine them in detail.
The most straightforward method utilizes random byte generation and subsequent encoding. However, it's crucial to differentiate between pseudo-random and cryptographically secure random number generation.
Go's math/rand package offers a pseudo-random number generator (PRNG). Using the same seed (initial value) always produces the same number sequence. While adequate for many applications, it's unsuitable for secure or unpredictable link generation.
For enhanced security, the crypto/rand package is preferable. It reads from the operating system's cryptographically secure random number generator, which is seeded from hardware entropy sources (think electromagnetic or timing noise), so its output is unpredictable. The trade-off is speed: virtual machines relying on their host for random data might experience slower generation under heavy load.
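The difference between the two packages can be sketched in a few lines. The helper names seededBytes and secureBytes are made up for this illustration, as are the seed value and the byte count.

```go
package main

import (
	crand "crypto/rand"
	"fmt"
	mrand "math/rand"
)

// seededBytes returns n bytes from a math/rand generator with a fixed seed.
// The same seed always yields the same bytes, so the output is predictable.
func seededBytes(seed int64, n int) []byte {
	r := mrand.New(mrand.NewSource(seed))
	b := make([]byte, n)
	r.Read(b) // (*rand.Rand).Read always fills b and returns a nil error
	return b
}

// secureBytes returns n bytes from the operating system's secure generator.
func secureBytes(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := crand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

func main() {
	fmt.Printf("math/rand:   %x\n", seededBytes(42, 6))
	fmt.Printf("math/rand:   %x\n", seededBytes(42, 6)) // same bytes as the line above

	b, err := secureBytes(6)
	if err != nil {
		panic(err)
	}
	fmt.Printf("crypto/rand: %x\n", b) // differs from run to run
}
```

Anyone who learns or guesses the seed can reproduce every ID that seededBytes will ever emit, which is exactly what a short link service wants to avoid.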
Raw random bytes aren't URL-friendly; encoding is necessary. Common encoding techniques include:
For user-friendly short links, Base58 offers a good balance of compactness and error resistance: it uses only letters and digits and leaves out the look-alike characters 0, O, I, and l.
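Go's standard library has no Base58 encoder, so here is a minimal sketch built on math/big. It assumes the Bitcoin-style alphabet; encodeBase58 and newID are illustrative names, and the 6-byte ID size is an arbitrary choice rather than what tnfy.link uses.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// Bitcoin-style Base58 alphabet (assumption): no 0, O, I, or l.
const alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// encodeBase58 treats b as a big-endian integer and writes it in base 58.
// Each leading zero byte is preserved as a leading '1' character.
func encodeBase58(b []byte) string {
	n := new(big.Int).SetBytes(b)
	base := big.NewInt(58)
	mod := new(big.Int)

	var out []byte // digits, least significant first
	for n.Sign() > 0 {
		n.DivMod(n, base, mod)
		out = append(out, alphabet[mod.Int64()])
	}
	for _, v := range b {
		if v != 0 {
			break
		}
		out = append(out, alphabet[0])
	}
	// Reverse into most-significant-first order.
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

// newID draws n secure random bytes and encodes them as Base58.
func newID(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return encodeBase58(b), nil
}

func main() {
	id, err := newID(6)
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

With 6 random bytes (48 bits) the result is at most 9 characters long. The length can vary slightly because small values need fewer Base58 digits.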
Key Points:
Hashing generates a fixed-length value from input (e.g., the long URL). While guaranteeing consistency (identical input always yields the same output), it lacks randomness. Consequently, shortening the same URL repeatedly produces identical IDs, failing the unpredictability requirement.
Adding a random salt before hashing introduces variability, but at that point the salt already supplies the randomness, so using raw random bytes directly is simpler and more efficient.
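To make the trade-off concrete, here is a small sketch of a hash-based ID. SHA-256, the 6-byte truncation, the base64url encoding, and the name hashID are all assumptions picked for the example.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// hashID hashes salt followed by longURL with SHA-256 and keeps the first
// 6 bytes, encoded as 8 URL-safe characters.
func hashID(longURL string, salt []byte) string {
	h := sha256.New()
	h.Write(salt)
	h.Write([]byte(longURL))
	sum := h.Sum(nil)
	return base64.RawURLEncoding.EncodeToString(sum[:6])
}

func main() {
	url := "https://example.com/some/very/long/path"

	// Without a salt the ID is a pure function of the URL.
	fmt.Println(hashID(url, nil))
	fmt.Println(hashID(url, nil)) // same ID again

	// With a random salt the ID changes on every call.
	salt := make([]byte, 8)
	if _, err := rand.Read(salt); err != nil {
		panic(err)
	}
	fmt.Println(hashID(url, salt))
}
```

The salted variant needs a secure random source anyway, so the hash step adds work without adding unpredictability.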
Universally Unique Identifiers (UUIDs) are widely used for unique value generation. Their default format is too long for short links, but re-encoding (e.g., in Base58) reduces size.
NanoID, an alternative, generates shorter strings (21 characters by default) using a customizable alphabet, optimizing for readability and error resistance.
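The idea behind NanoID can be sketched without the library. This is not the NanoID implementation, only an illustration under the assumption of a 64-symbol URL-safe alphabet; nanoLikeID is a hypothetical name.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// 64 URL-safe symbols. Because 64 divides 256 evenly, masking a random byte
// with 63 picks every symbol with equal probability.
const urlAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-"

// nanoLikeID returns size characters drawn uniformly from urlAlphabet.
func nanoLikeID(size int) (string, error) {
	buf := make([]byte, size)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = urlAlphabet[b&63]
	}
	return string(buf), nil
}

func main() {
	id, err := nanoLikeID(21)
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

An alphabet whose size is not a power of two would need rejection sampling instead of a simple mask to stay unbiased.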
Why Avoid UUIDs?
The commonly used version 4 UUIDs are fundamentally just random bytes with a few fixed version and variant bits, so they offer no significant advantage over directly generating random values.
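A hand-rolled sketch shows how little a version 4 UUID adds on top of random bytes. The function names are illustrative, and base64url is used for the compact form only to keep the example short; Base58 would work the same way.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newUUIDv4 fills 16 bytes with secure random data and then fixes the
// version and variant bits, leaving 122 random bits.
func newUUIDv4() ([16]byte, error) {
	var u [16]byte
	if _, err := rand.Read(u[:]); err != nil {
		return u, err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // version 4
	u[8] = (u[8] & 0x3f) | 0x80 // RFC variant (binary 10xxxxxx)
	return u, nil
}

// canonical renders the usual 36-character hex form with dashes.
func canonical(u [16]byte) string {
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// compact re-encodes the same 16 bytes as 22 URL-safe characters.
func compact(u [16]byte) string {
	return base64.RawURLEncoding.EncodeToString(u[:])
}

func main() {
	u, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(canonical(u), len(canonical(u)))
	fmt.Println(compact(u), len(compact(u)))
}
```

Even re-encoded, the ID is 22 characters, which is still long for a short link, and trimming it means giving up the UUID format entirely.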
Random value generation can occasionally lead to duplicates, particularly under high load or with shorter IDs. While tnfy.link isn't designed for high-load scenarios, potential issues warrant consideration.
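How often duplicates occur can be estimated with the birthday approximation. The sketch below assumes IDs drawn uniformly from a space of 2^bits values; collisionProbability is an illustrative helper, and the link counts are example numbers rather than tnfy.link figures.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance that at least two of n
// uniformly random IDs collide in a space of 2^bits values, using
// p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)).
func collisionProbability(n float64, bits int) float64 {
	space := math.Exp2(float64(bits))
	return -math.Expm1(-n * (n - 1) / (2 * space))
}

func main() {
	for _, bits := range []int{32, 48, 64} {
		fmt.Printf("%d bits, 1e6 links: p ≈ %.3g\n", bits, collisionProbability(1e6, bits))
	}
}
```

The probability grows roughly with the square of the number of links, so each extra byte of ID length buys a lot of headroom.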
A sequential counter inherently guarantees uniqueness. Redis's INCR command increments a key atomically, which makes it easy to share one counter across several instances. However, sequential IDs are predictable. Combining a sequence with random bytes resolves this, ensuring both uniqueness and unpredictability.
For instance:
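As a minimal sketch of that combination, the Go program below puts a counter value in front of a few random bytes. The in-process counter is only a stand-in for a shared sequence such as Redis INCR, and nextID and the varint layout are assumptions chosen for illustration, not tnfy.link's implementation.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"sync/atomic"
)

// counter stands in for a shared sequence (assumption: a single process;
// a distributed setup would fetch the next value from a shared store).
var counter uint64

// nextID encodes the next sequence number as a varint, appends randomBytes
// secure random bytes, and returns the result as URL-safe text.
// The sequence part makes every ID unique; the random part makes the full
// ID hard to guess.
func nextID(randomBytes int) (string, error) {
	seq := atomic.AddUint64(&counter, 1)

	buf := make([]byte, binary.MaxVarintLen64+randomBytes)
	n := binary.PutUvarint(buf, seq)
	if _, err := rand.Read(buf[n : n+randomBytes]); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf[:n+randomBytes]), nil
}

func main() {
	for i := 0; i < 3; i++ {
		id, err := nextID(3)
		if err != nil {
			panic(err)
		}
		fmt.Println(id)
	}
}
```

Because the varint encoding of one number is never a prefix of another and the random suffix has a fixed length, two different sequence values can never produce the same ID, so no collision check against the database is needed.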
Note: A sequential component might reveal the total number of links generated, potentially undesirable in some contexts.
This post explored various short link generation methods:
For most applications, Base58-encoded random bytes are sufficient. For high-load collision handling, combining random bytes with a sequential component is robust. While not yet implemented in tnfy.link's backend, it's planned as a future optional feature.
Thank you for reading! Your feedback on link generation is welcome in the comments!
Related Post
For more on my projects, see my article on SMS Gateway for Android.