## Generating primes

Here’s my implementation of the sieve of Eratosthenes in pyhton:

cur_num = 2 multiples = {2 : 2} while True: is_prime = True for v in multiples: nv = multiples[v] - 1 if nv == 0: is_prime = False multiples[v] = v else: multiples[v] = nv if is_prime == True: print cur_num multiples[cur_num] = cur_num cur_num = cur_num + 1

The difference with the pseudocode found on the wikipedia page is that it doesn’t try to reject numbers on a fixed range, it just compute every occurrence of multiples of known primes at every step. I based my implementation on the fact that every non-prime number is composed of *n* primes multiplied together where *n>1*, in a unique way. My previous post on prime numbers used that fact to implement a quick RTTI *IsChildOf* test.

How does it works? For every prime number encountered, a counter equal to that number is created (in the *multiples* map). Then, at each iteration, we decrement every counters. When a counter reaches zero, it means that the current number can be divided by this prime number, thus it is not prime. When all counters are different from zero, the current number must be a prime, since there’s no known prime divisor to compose it.

A nice property of this implementation is that when we encounter a non-prime number, all counters that are currently at zero represents all the primes that are used to compose the current non-prime number. Here’s the updated code:

cur_num = 2 multiples = {2 : 2} while True: is_prime = True for v in multiples: nv = multiples[v] - 1 if nv == 0: is_prime = False multiples[v] = v else: multiples[v] = nv if is_prime == True: print cur_num, "[prime]" multiples[cur_num] = cur_num else: primes = list() for v in multiples: if multiples[v] == v: primes.append(v) print cur_num, "[nonprime] ", primes cur_num = cur_num + 1

Which outputs this:

... 83 [prime] 84 [nonprime] [2, 3, 7] 85 [nonprime] [5, 17] 86 [nonprime] [2, 43] 87 [nonprime] [3, 29] 88 [nonprime] [2, 11] 89 [prime] 90 [nonprime] [2, 3, 5] 91 [nonprime] [7, 13] 92 [nonprime] [2, 23] 93 [nonprime] [3, 31] 94 [nonprime] [2, 47] 95 [nonprime] [5, 19] 96 [nonprime] [2, 3] 97 [prime] ...

This is not the best implementation for this algorithm, as it was mainly done to test the idea I had in mind and to learn python. The main problem is as we search for higher primes, speed decrease quite fast due to the number of prime counters to lookup. There are several things we could do, like skip even numbers and updates counters accordingly or write a C++ implementation.

## Lessons learnt while spinning

In a recent project I worked on, I tried to minimize context switches by strategically replacing a critical section in our memory allocator by a custom spinlock that looked like this:

class SpinLock { volatile tInt LockSem; public: FORCEINLINE SpinLock() : LockSem(0) {} FORCEINLINE tBool Lock() { while(1) { // Atomically swap the lock variable with 1 if it's currently equal to 0 if(!InterlockedCompareExchange(&LockSem, 1, 0)) { // We successfully acquired the lock MemoryBarrier(); return; } } } FORCEINLINE void Unlock() { MemoryBarrier(); LockSem = 0; } };

This indeed worked as predicted, but produced an unwanted side-effect on platforms without threads priority inversion (where a high-priority thread spins in a loop while waiting for a low-priority thread to release a lock) as the Xbox 360 (described in this technical article).

It created rare situations where the game was deadlocked, and the time spent to figure-out what was going on was quite high. At first, we played with threads priorities and affinities, but it had other unwanted side-effects. Another solution tried was to yield after a certain amount of spin loops, generating a context switch that let the OS reschedule threads correctly. In the end, we reverted to a critical section.

Lesson learnt? We have to be very careful when busy-waiting, especially on OSs that don’t have priority inversion protection mechanisms. Using the OS synchronization primitives is the best way of coordinating multiple threads. When locking becomes a performance bottleneck, we have to address the underlying problem (using lockless programming, etc).

## GUID generation

I recently had to generate globally unique identifiers (GUIDs) for a personal project that I have at home. They could be generated concurrently on several threads and on multiple machines over the network, and the only rules I wanted to stick with were to avoid duplicates and be as fast as possible. Here’s what I ended up with:

void GUID::GenerateUnique() { Time Time; GetTime(Time); static tU16 StaticMask = 0; StaticMask = InterlockedIncrement(StaticMask); A = GUUID; B = GMACAddress >> 16; C = (StaticMask << 16) | (GMACAddress & 0xffff); D = (Time.Second + Time.Minute * 60 + Time.Hour * 60 * 60) | (Time.Month << 17) | (Time.Day << 22) | ((Time.Year - 2011) << 27); // Simple encryption tU32 Mask = (StaticMask << 16) | ((StaticMask * 0x0bb38435) + 0x3619636b); A = ~A ^ Mask; B = ~B ^ Mask; C = ~C ^ StaticMask; // don't encrypt the mask itself D = ~D ^ Mask; }

A, B, C and D are 32-bits unsigned values. A contains a CRC32 of the PC’s username. The PC’s MAC address is encoded in B and C parts. The remaining 16 bits of C also contains a sequential number which is atomically incremented at each call to avoid generating the same GUID when called from multiple threads. D contains a timestamp, similar to a Unix Time. At the end, a quite simple encryption scheme is used to make sure that the MAC address isn’t easily exposed directly in the GUID.

One cool thing about this is that at any point, a GUID can be decrypted and everything can be extracted from it; the CRC32 of the user who generated it, the MAC address and the timestamp:

String GUID::GetDescStr() const { tU16 StaticMask = ~(C >> 16); tU32 Mask = (StaticMask << 16) | (StaticMask * 0x0bb38435) + 0x3619636b; tU32 Mac0 = ~(B ^ Mask); tU32 Mac1 = ~(C ^ StaticMask); String Result = String::Printf( "UUID:%08x MAC:%02x-%02x-%02x-%02x-%02x-%02x ", ~(A ^ Mask), (tU32)((Mac0 & 0xff000000) >> 24), (tU32)((Mac0 & 0xff0000) >> 16), (tU32)((Mac0 & 0xff00) >> 8), (tU32)((Mac0 & 0xff)), (tU32)((Mac1 & 0xff00) >> 8), (tU32)((Mac1 & 0xff)) ); tU32 Timestamp = ~(D ^ Mask); tU32 Seconds = (Timestamp & 0x1ffff); tU32 Month = ((Timestamp >> 17) & 0x1f); tU32 Day = ((Timestamp >> 22) & 0x1f); tU32 Year = (Timestamp >> 27) + 2011; tU32 Hour = (Seconds / (60 * 60)); Seconds -= Hour * 60 * 60; tU32 Minute = (Seconds / 60); Seconds -= Minute * 60; Result += String::Printf("Date:%d-%02d-%02d %d:%02d:%02d ", Year, Month, Day, Hour, Minute, Seconds); Result += String::Printf("Mask:0x%08x", Mask); return Result; }

Size didn’t matter in my case, but this could be easily reduced to 96-bits by removing the CRC32 of the username, without changing the uniqueness of the GUIDs. If the size really matters, one could possibly reduce it to 64-bits by using the CRC32 of the username and the timestamp, but the function would possibly need to be protected by a critical-section to avoid 2 threads generating the exact same GUID.

## Working hard doesn’t mean working late

John Carmack, founder and technical director of Id Software, recently posted this on Twitter: *“The meme that people are objectively less productive in total when working more than 40 hrs a week is idiotic.”*. This is true when working 45-50 hours a week (or even more) because you’re motivated to do so. The problem starts when overtime becomes “highly suggested”, directly by managers or indirectly (peer pressure).

I’ve been working in this industry for 14 years now, and when I look back at my first years, Carmack’s statement made a lot more sense. I had no wife or kids, I was mainly spending my free time at home programming anyways, so why not stay at work 60 hours a week and get free food? But time changes, I now have kids, and as soon as I have to leave later than expected every day, motivation goes down a lot. This blog post really explains how I feel.

Look around you, and tell me what do you see? Programmers working 60-70 hours per week, arriving late in the morning, taking long coffee breaks and laughing looking at funny cats on youtube, or focused programmers with headsets that comes early in the morning, working hard on their tasks 40-45 hours per week, getting the job done? What do you prefer your team to be composed of?

## Prime numbers

Prime numbers are a very interesting topic. I often find myself quite uncomfortable when setting my car’s radio volume on an even number, and I really feel better when it’s a prime number. But this is a complete different story.

So, what are the uses of prime numbers in programming? Cryptography, random numbers generation, better hashing functions, etc. One thing that I recently discovered about prime numbers is that if you multiply *n* different primes together, it creates a special composite number (if anyone knows the real name of those, please tell me) that has a quite interesting property: there’s a way to know if a certain prime *p* was used or not to compose it simply by dividing by *p*. If the result is an integer, then *p* was used to compose the number. Take this example:

- 3 * 7 * 13 * 53 * 227 = 3284463

To know if prime number 19 was used to compose it, we do:

- 3284463 / 19 = 172866.474

Since the result is not an integer, 19 wasn’t used to compose the number 3284463. But, if we take a look at:

- 3284463 / 3 = 1094821

3284463 / 7 = 469209

3284463 / 13 = 252651

3284463 / 53 = 61971

3284463 / 227 = 14469

A practical usage of this property is for a RTTI system. By assigning a different prime number to each class in the hierarchy, and by multiplying each child’s class value by it’s parent value, we can determine if a class *A* derives from class *B* simply by dividing its composite number by *B*‘s number, instead of going through all the parents chain:

Bool Class::IsChildOf(const Class* B) const { return ((Prime % B->Prime) == 0); }