Happiness is a Choice

What a great post from Bronnie Ware, which prompted her to write a book, in which she lists the five most common regrets that terminally ill patients had on their way out of this world. “I wish that I let myself be happier” is the one I find most interesting. As Bronnie says:

 Many did not realise until the end that happiness is a choice. They had stayed stuck in old patterns and habits. The so-called ‘comfort’ of familiarity overflowed into their emotions, as well as their physical lives. Fear of change had them pretending to others, and to their selves, that they were content. When deep within, they longed to laugh properly and have silliness in their life again.

Play is such an important part of human existence. Like love and art and mathematics, it skips past culture and economics and politics. It slips through language barriers. It jumps over random bumps in the road. And it fills us with silliness and wonder.

OMGZynga

GamesBeat has a nice quick interview with Dan Porter and David Ko about Zynga’s reported $200m acquisition of Draw Something maker OMGPOP. Grats to both companies, but $200m does seem a bit pricey. I guess if you’re Zynga, with “3,000 people working on games” and you “can’t make every hit” (this is an innocuous but weird thing to say, Dean), it’s simply market share/conquest, which makes the validity of the number less relevant.

Ah, consolidation. It’s so utterly human and happens to every industry. My co-founder, Laddie Ervin, and I spend a lot of time talking about it. I think this acquisition is important enough to be a data point on the mobile and social gaming consolidation curve.

Delayed Communication

Mark Suster has a great series on negotiations going. His post yesterday, A Quick Hack for Speeding up Term Sheet and other Negotiations, makes the (fairly obvious) point that getting everyone into a room together to negotiate, while not easy to do, is the best hack for avoiding delays. Mark illustrates this while discussing his very first term sheet negotiation.

Delays during contract negotiations can be crushingly frustrating: If the contract involves money (most do), the longer it takes, the bigger the delay before capital can be deployed and transformed into ROI. Why does a “signing party”, as Mark puts it, work so well? Again, it’s obvious: get us all together in one room with one goal and one deadline and we’re going to get something done. A face-to-face meeting is an instinctive, faster form of communication that trumps more sophisticated forms like phone calls and email. This is a great example of how form drives function.

Almost by definition, the more sophisticated the technology, the more complex its form. Tons of email wanting immediate attention, ten versions of the same document in redlines, a dozen voice mails we still haven’t had time to check — the function is communication but quickly reveals itself to be delayed communication. And in the case of a contract negotiation, delays are often regarded, by at least one party and ideally by both, as detrimental to ROI.

Delayed communication can also be very positive. A great example of that is the recent hit game Draw Something. It launched in February and is already at the top of the App Store charts. According to Carter Thomas (he has a great writeup on the game here), Draw Something has already exceeded 20 million downloads and is earning its developer, OMGPOP, $100k+ per day.

OMGPOP did something great with the core design by using delayed communication as an integral part of the experience. Unlike a game of Pictionary in the living room, on our mobile devices we need a passive delay (time) to draw pictures and guess drawings. Matching up players in real-time is too hard to do because there are too many distractions. Phone calls, IM, push notifications, etc. would turn play into work since users are generating content. For more traditional, non-user generated content games, a real-time multiplayer structure can work, for example Eliminate from ngmoco, but even then it’s not easy.

Draw Something is not a fantastic new game mechanic. It’s not quite a new twist on an old theme and certainly not traditional gaming or storytelling in any real sense. But as more professional game designers and developers migrate over from traditional to non-traditional games, let’s hope they pay some attention to it.

Satisfy Your Inner Warlord

We had a nice recent little writeup called “Satisfy Your Inner Warlord” on World Siege: Orc Defender in Connected World magazine. Orc Defender has been doing well and is currently free-to-play on iPhone (also available in HD form on iPad). Players who love the game really love it, and the number of keeps on the map is increasing at a nice clip every day. Thanks to everyone who has played and supported the game so far; we’ve got some fun new features planned.

GDC @ SF 2012

It was a sunny, but cool, recovery weekend here in Southern Oregon after GDC in San Francisco. I believe it was my 13th GDC (it’s hard to recall for sure, but I can remember when it was called CGDC instead of GDC — yikes, that makes me feel a bit old).

We went down for the whole week this year and had a slew of meetings. Only one of them was somewhat weird/pointless, and a handful were downright impressive.

It’s interesting when doing a meet-and-greet for the first time, usually with a BizDev or DevRel person — you very quickly get a sense of how much they know about actual development and the creative process vs. the business side of things. Some are all business and it’s pretty obvious that they are more-or-less industry agnostic (do they even play games?), while others (often but not always from the PD side of the house) are real gamers and invest themselves in the design/tech/creative.

Not surprisingly, gamer types are more fun and the meetings are upbeat. The downside is that it’s more difficult to stay focused on the meeting. The more-strictly-business types can be just as fun, although if they’re representing a big company that’s growing fast they tend to be somewhat vague and boilerplate — they’re just stretched too thin. Or they’re searching for one specific thing/idea/capability and that’s all they can see (or have time to see).

We had a couple of meetings where the folks with whom we were meeting immediately noticed the coolest bits about what we were showing. It’s the difference between hearing “looks beautiful” vs. “love the multiple layers of parallax and depth-of-field combined with a simple mechanic”. While rote positive feedback is a nice ego-stroke, specific feedback is awesome and energizing.

Conferences take a lot of energy due to the volume of encounters, and the “energy exchange” during meetings can be radically different depending on the people. Being a vert (sort of 50/50 intro/extrovert), I find meetings to be a bit of a roller coaster: the amount of mana in my mana bar goes up and down a lot. While this happens to some extent with everyone, most people I know tend to be more DOT (Damage Over Time) or maybe AOE (Area of Effect), i.e. energy is lost or gained at a more-or-less linear rate over the course of the conversation. I’ve noticed that in my case I can just as easily end a meeting with a full bar of mana as I can an empty bar.

In any case GDC has changed a lot over the years. Many of the sessions are down to a clean 45 minutes, and it seems as if every year presenters get better and better at staying with their slides. This is way less fun than years ago when sessions tended to run over and speakers had more off-the-PowerPoint verbal nuggets to deliver. I presented a few years ago with a colleague (we did a post-mortem on a mobile game) and we had a full hour, but that was not nearly enough time to cover all the stuff we wanted to cover. I hope that GDC swings back toward longer/deeper sessions in the future; that would add more to the value proposition. And as much as I like San Francisco, I find myself yearning for the somewhat less polished, geekier San Jose GDC days (and the SJ Fairmont!). Given the remote probability of either one of those things happening, maybe I should just check out GDC Austin this year instead.

 

Terrain on iOS

Over the holiday I decided to play with a new terrain implementation for an upcoming Kineplay title. Terrain rendering is nothing new and there are a lot of approaches to it — check out the dozens of implementations at the Virtual Terrain Project alone. While I’ve implemented terrain several times over the years, for mobile it’s always been an extra challenge — almost always degenerating into a small quadtree with no real LOD (too expensive),  ridiculously few triangles (polygon limits on most mobile platforms before iOS) and horribly small textures (never enough memory).

With iOS, there’s a bit more room and with a careful implementation, interactive frame rates are possible with a very large terrain that actually looks halfway realistic. There are plenty of caveats, but at least it’s more-or-less doable. This post is about the more interesting bits in the implementation I wrote over the break, with some code snippets that I hope other aspiring mobile terrain makers will find useful. Unfortunately I can’t publish a complete project since the code is integrated with our engine. If I can ever find the time, I’ll do a follow-up with a standalone project.

There are lots of interesting challenges with terrain, but the most important is LOD (Level of Detail). LOD is often deployed for other geometry in a game, but it’s crucial for terrain. Terrain LOD is how accurately the terrain is rendered, based on some condition(s). For example, a completely flat terrain only needs one quad — two triangles — to accurately render it. For a complex mountain-side, far more polygons would be needed.

This can ramp up very fast for complex morphology; in fact, maintaining good frame rates even on non-mobile platforms can be a challenge. While an Xbox 360 or PC title has orders of magnitude more polys to work with than iOS, a large, high-resolution terrain can still significantly impact the poly budget. For example, just one 256×256 grid of quads takes 66,049 vertices and 65,536 quads, which is 131,072 triangles. Many grids that size would be needed to very nicely represent, say, a 2km square area. Even 64 grids and you’re already over four million quads (more than eight million tris), and still over a million after frustum culling. And if the terrain is the least bit interesting, it’s all fill-rate happy.
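
To make the arithmetic concrete, here is a tiny sketch in plain C (the function names are mine, purely for illustration). An n×n grid of quads needs (n+1)×(n+1) vertices and two triangles per quad:

```c
/* Vertices needed for an n x n grid of quads: one more per side than quads. */
static long grid_vertices(long n)
{
    return (n + 1) * (n + 1);
}

/* Triangles needed for an n x n grid of quads: two per quad. */
static long grid_triangles(long n)
{
    return 2 * n * n;
}
```

For n = 256 that works out to 66,049 vertices and 131,072 triangles per grid, before any culling or LOD.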

This is where terrain LOD comes in, in a big way. There are two basic types of LOD:

1. Static LOD is the terrain’s representation based on its heightfield complexity. The fun part of static LOD is how to seamlessly join neighboring vertices at different LODs.

2. Dynamic LOD is the terrain’s representation based on the distance between the camera and the terrain’s vertices. The joy of dynamic LOD is figuring out the best way to avoid excessive “popping” from one LOD to the next.

Lots of Master’s theses have been written about these (see the Virtual Terrain Project link above), with more on dynamic LOD since it tends to be harder to get right and has to be done every frame. For this blog post, I’ll focus on static LOD. For terrain on iOS, it’s a must and how you implement it will have an impact on how you implement dynamic LOD. I’ll cover my dynamic LOD scheme in a future post. For my implementation, the important steps for static LOD were:

Heightfield Data. I generated a 16-bit .raw file using Grome (a great program from Quad Software which I recommend, although the interface will take some getting used to). 16-bit is important: For a large terrain, an 8-bit heightfield simply does not provide enough resolution to accurately represent hills and mountains, for example. Note that the dimensions of the heightmap are 2^n+1, not 2^n (for example, 1025×1025 instead of 1024×1024). If you think of each pixel as the height of a vertex at a given (x, y) position, with the sub-pixel space between pixels as a quad, this makes sense. You can also visualize a pixel as a quad with four sub-pixels in each corner, with the sub-sub-pixel space defining a sub-quad, although IMO that’s overkill. Loading a 16-bit .raw file is easy, by the way (in this example, mapDataLen and mapData are class variables, an int and a GLushort*, respectively):
 

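- (void) LoadRaw16:(NSString*)filename
{
    // mapDataLen is the file size in bytes (two bytes per 16-bit sample)
    mapDataLen = GetFileLen(filename);
    if (mapDataLen > 0)
    {
        FILE* file = GetBundleFileBinary(filename);
        if (!file) return;
        // the buffer only needs mapDataLen bytes, since fread below reads bytes
        mapData = malloc(mapDataLen);
        if (mapData) fread(mapData, 1, mapDataLen, file);
        fclose(file);
    }
}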
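
For anyone without the engine helpers, a rough standalone equivalent in plain C might look like the sketch below. It is not the loader from our engine: load_raw16 and is_pow2_plus_one are hypothetical names, and it assumes a square, headerless file of little-endian samples (check the byte order your exporter actually writes). It also rejects files whose side length is not 2^n+1.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if side == 2^n + 1 for some n >= 1 (3, 5, 9, ..., 1025, ...). */
static int is_pow2_plus_one(long side)
{
    long n = side - 1;
    return n >= 2 && (n & (n - 1)) == 0;
}

/* Hypothetical loader: reads a square, headerless, little-endian 16-bit
   heightmap. Returns the samples (caller frees) and stores the side length
   in *sideOut, or returns NULL if the file is missing or the wrong shape. */
static uint16_t *load_raw16(const char *path, int *sideOut)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    fseek(f, 0, SEEK_END);
    long bytes = ftell(f);
    fseek(f, 0, SEEK_SET);

    long samples = bytes / 2;            /* two bytes per height sample */
    long side = 0;
    while (side * side < samples) side++;

    uint16_t *data = NULL;
    if (bytes > 0 && bytes % 2 == 0 &&
        side * side == samples && is_pow2_plus_one(side))
        data = malloc((size_t)bytes);

    if (data) {
        unsigned char b[2];
        for (long i = 0; i < samples; i++) {
            if (fread(b, 1, 2, f) != 2) { free(data); data = NULL; break; }
            data[i] = (uint16_t)(b[0] | (b[1] << 8));   /* low byte first */
        }
    }
    fclose(f);
    if (data) *sideOut = (int)side;
    return data;
}
```

Assembling each sample from two bytes keeps the result the same regardless of the host CPU's byte order, at the cost of a slower load than a single fread.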

 

Chunk It Up. A common idea behind most real-time terrain engines is to create chunks of terrain, at some minimum and maximum resolution (based on static and dynamic LOD requirements, for example). Chunks are a great way to set up the terrain, giving us a way to quickly determine what parts of the terrain to render. In my implementation, I pass in the number of chunks I want to create in the init function: 32, for example, which would create 32×32 or 1024 chunks of terrain. Each chunk will end up with some resolution (the number of quads in the chunk) based on the static LOD requirements of the terrain morphology. The upper limit is the maxLevel; it specifies the max number of n×n quads in each chunk. At a maxLevel of 16, for example, one chunk of terrain will use up to 16×16 = 256 quads, or 512 triangles, to render it.
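
As a quick sanity check on those numbers, here is a small C sketch (the helper names are hypothetical, not from the engine). It assumes a (2^n+1)-wide heightmap split evenly into chunks, and a chunk at a given level drawn as level×level quads:

```c
/* Height samples spanned by one chunk along a side, e.g. a 1025-wide map
   split into 32 chunks per side gives 32 samples per chunk. */
static int chunk_span(int mapSide, int chunksPerSide)
{
    return (mapSide - 1) / chunksPerSide;
}

/* Quads and triangles used to draw a chunk at a given level. */
static int level_quads(int level)     { return level * level; }
static int level_triangles(int level) { return 2 * level * level; }

/* Spacing, in height samples, between neighboring vertices at a level. */
static int level_step(int span, int level)
{
    return span / level;
}
```

With a 1025×1025 map and 32 chunks per side (my pairing of the two examples above, as an assumption), each chunk spans 32 samples, so level 16 samples every second height and level 1 uses only the four corners.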

There are several ways to compute the static LOD for a chunk. A fairly easy, fast way is to guess at how co-planar (some number of triangles in) the chunk is based on its four corner height values. Sample each corner from the raw height data, add ’em up and take the average. In practice you’ll need a bias to “force” the chunk’s resolution higher or lower, depending on how much overall resolution you want to see and how much rendering time you’re willing to give over to terrain. In my implementation, I also found it convenient to normalize my co-planarity value to (0, 1) by computing the min/max height for the chunk and dividing by the min/max range (multiplied by the bias, also in (0, 1)). Having the value normalized made it easy to pass it to a function in my main Terrain class that could quickly return the actual LOD needed.
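
Here is one way that idea could be sketched in plain C. To be clear, this is an illustration under my own assumptions, not the exact formula from the engine: it compares the chunk's center sample against the average of its four corners (on a plane the two are equal), divides the difference by the chunk's min/max range times the bias, and clamps to [0, 1]. Both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical estimate of how far a chunk is from planar, in [0, 1].
   map is a mapSize x mapSize heightfield; the chunk covers samples
   x0..x0+chunkSize and y0..y0+chunkSize. A smaller bias pushes the
   result (and so the LOD) higher. */
static float chunk_roughness(const uint16_t *map, int mapSize,
                             int x0, int y0, int chunkSize, float bias)
{
    int x1 = x0 + chunkSize, y1 = y0 + chunkSize;
    if (bias <= 0.0f) bias = 1.0f;           /* guard against divide by zero */

    /* average of the four corner heights */
    float cornerAvg = (map[x0 + y0 * mapSize] + map[x1 + y0 * mapSize] +
                       map[x0 + y1 * mapSize] + map[x1 + y1 * mapSize]) * 0.25f;

    /* min/max over every sample in the chunk */
    uint16_t lo = 65535, hi = 0;
    for (int y = y0; y <= y1; y++)
        for (int x = x0; x <= x1; x++) {
            uint16_t h = map[x + y * mapSize];
            if (h < lo) lo = h;
            if (h > hi) hi = h;
        }
    if (hi == lo) return 0.0f;               /* perfectly flat chunk */

    float center = map[(x0 + chunkSize / 2) + (y0 + chunkSize / 2) * mapSize];
    float dev = center > cornerAvg ? center - cornerAvg : cornerAvg - center;
    float r = dev / ((float)(hi - lo) * bias);
    return r > 1.0f ? 1.0f : r;
}

/* Map a roughness in [0, 1] to a power-of-two level in 1..maxLevel. */
static int level_for_roughness(float r, int maxLevel)
{
    int steps = 0;
    for (int l = maxLevel; l > 1; l >>= 1) steps++;   /* log2(maxLevel) */
    return 1 << (int)(r * steps + 0.5f);
}
```

A single center sample is cheap but will miss a bump that sits off-center, which is one more reason the bias ends up being tuned by eye.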

Patch Class. To save a few tons of memory, I created a terrain patch class that stores indices to vertices for each of the LODs needed to render a chunk of terrain. This kind of thing has been done before, and there are a number of ways in which to do it. Many of them like to focus on creating full patches that add single-row strips of quads to connect up LODs. In my implementation, to optimize fetching patches by easing up on the number of memcpy calls to fill index buffers, I opted for a larger number of chunks (with some repetitive data) that cover the LOD differences for each level (from 1-maxLevel) for each of four potential neighboring patches at all possible levels (1-maxLevel).

That last sentence was a mouthful — hopefully something visual will help. The image below shows the patches needed for each level with a maxLevel of 16. In each row, every quad except for the last two consists of four patches (for each neighbor — left, right, top and bottom) that connect that level to all the other potential LODs up to maxLevel. In the first row, for example, the first quad — let’s call it qLOD — equals 1, and the LOD for its left, right, top and bottom — we’ll call it nLOD — is also 1. For the second quad in row 1, qLOD = 1 and nLOD = 2, the third quad is qLOD = 1 and nLOD = 4, and so on. Each step is double the last one. So for qLOD = 1, the possible steps up to maxLevel are 1, 2, 4, 8 and 16. For qLOD = 2 (the second row), the possible steps are 2, 4, 8 and 16 — hopefully you get the picture:

[Figure: Patch Levels]

The last two quads in each row are the “internal partial” and full patches needed to complement the edge patches. An internal partial patch consists of the rest of the quads needed after the edge patches are computed (note that qLOD = 1 doesn’t need it). The full patch is for cases where all neighbors match.
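
The doubling steps in each row are easy to enumerate. A minimal C sketch, with a hypothetical helper name:

```c
/* Fill out[] with the neighbor levels (nLOD) that a chunk at qLOD may need
   an edge patch for: qLOD, 2*qLOD, 4*qLOD, ... up to maxLevel.
   Returns how many levels were written (at most cap). */
static int neighbor_steps(int qLOD, int maxLevel, int *out, int cap)
{
    int n = 0;
    for (int l = qLOD; l <= maxLevel && n < cap; l *= 2)
        out[n++] = l;
    return n;
}
```

With a maxLevel of 16, qLOD = 1 yields five steps (1, 2, 4, 8, 16) and qLOD = 2 yields four (2, 4, 8, 16), matching the rows described above.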

The construction of the patches is clockwise (for OpenGL, GL_FRONT face culling), and built from the middle of a quad to the extents of each side (left, right, top, bottom). It ends up as a single subdivision for each triangle according to nLOD. For example, the first three quads above will produce the following triangles (shown green-filled) for nLOD levels 1, 2 and 4, for the left side:

[Figure: left-side triangles for nLOD levels 1, 2 and 4]

This patching scheme is based on the observation that the edge shared with a neighboring chunk only needs to be changed if the neighbor is at a higher LOD. Accordingly, when the patches are requested at runtime, care must be taken to request them in the right order with logic that checks neighboring chunks for LODs that are higher.

Below is a snippet from the code that computes the patches. patchCount is the maxLevel; count is the current LOD to compute (qLOD above); NF, NP, NL, NR, NT and NB are the neighbor levels (nLOD above), corresponding to full, partial, left, right, top and bottom (again, NF covers the case where a full patch is needed because all neighbors match the current chunk, while NP covers the patch needed when one or more edge patches are changed). TerrainPatchBuffer stores an index buffer for each patch, which references an array of (2^n+1)×(2^n+1) vertices.
 

/////////////////////////////////////////////////////////////////////////////
// ComputePatch
// -- winding order is clockwise (GL_FRONT)
/////////////////////////////////////////////////////////////////////////////
- (void) ComputePatch:(int)count NF:(int)NF NP:(int)NP
                                 NL:(int)NL NR:(int)NR NT:(int)NT NB:(int)NB
{
    IETerrainPatchBuffer* m = nil;
    int n, x, y, ncount, nsize, max = count - 1,
                                 size = patchCount / count, half = size / 2;
    IE_POINT2D p1, p2, p3, p4, start, center, center2;
    if (NF)
    {
        m = [[IETerrainPatchBuffer alloc] initWithCount:count * count * 6
                                             patchCount:patchCount];
        m.type      = PATCH_F;
        m.typeCount = count;
        for (x = 0; x < count; x++)
        {
            p1.x = x * size;
            p2.x = p1.x + size;
            p3.x = p2.x;
            p4.x = p1.x;
            for (y = 0; y < count; y++)
            {
                p1.y = y * size;
                p2.y = p1.y;
                p3.y = p1.y + size;
                p4.y = p3.y;
                [m AddQuad:p1 p2:p2 p3:p3 p4:p4];
            }
        }
    }
    else if (NP && count != patchCount)
    {
        m = [[IETerrainPatchBuffer alloc] initWithCount:max * max * 6
                                             patchCount:patchCount];
        m.type      = PATCH_P;
        m.typeCount = count;
        for (x = 0; x < max; x++)
        {
            p1.x = half + x * size;
            p2.x = p1.x + size;
            p3.x = p2.x;
            p4.x = p1.x;
            for (y = 0; y < max; y++)
            {
                p1.y = half + y * size;
                p2.y = p1.y;
                p3.y = p1.y + size;
                p4.y = p3.y;
                [m AddQuad:p1 p2:p2 p3:p3 p4:p4];
            }
        }
    }
    else if (NL >= count && count != patchCount)
    {
        ncount      = NL / count;
        nsize       = patchCount / NL;
        int tcount  = (count * ncount + (count - 1)) * 3;
        m           = [[IETerrainPatchBuffer alloc] initWithCount:tcount
                                                       patchCount:patchCount];
        m.type      = PATCH_L;
        m.typeCount = NL;
        p1.x        = 0;
        p2.x        = 0;
        center.x    = half;
        start.x     = 0;
        for (y = 0; y < count; y++)
        {
            center.y = y * size + half;
            start.y  = y * size;
            for (n = 0; n < ncount; n++)
            {
                p1.y = start.y + (n * nsize);
                p2.y = p1.y + nsize;
                [m AddTriangle:center p2:p2 p3:p1];
            }
            if (y < max)
            {
                center2.x = center.x;
                center2.y = (y + 1) * size + half;
                [m AddTriangle:center p2:center2 p3:p2];
            }
        }
    }
    else if (NR >= count && count != patchCount)
    {
        ncount      = NR / count;
        nsize       = patchCount / NR;
        int tcount  = (count * ncount + (count - 1)) * 3;
        m           = [[IETerrainPatchBuffer alloc] initWithCount:tcount
                                                       patchCount:patchCount];
        m.type      = PATCH_R;
        m.typeCount = NR;
        p1.x        = patchCount;
        p2.x        = patchCount;
        center.x    = patchCount - half;
        start.x     = patchCount;
        for (y = 0; y < count; y++)
        {
            center.y = y * size + half;
            start.y  = y * size;
            for (n = 0; n < ncount; n++)
            {
                p1.y = start.y + (n * nsize);
                p2.y = p1.y + nsize;
                [m AddTriangle:center p2:p1 p3:p2];
            }
            if (y < max)
            {
                center2.x = center.x;
                center2.y = (y + 1) * size + half;
                [m AddTriangle:center p2:p2 p3:center2];
            }
        }
    }
    else if (NT >= count && count != patchCount)
    {
        ncount      = NT / count;
        nsize       = patchCount / NT;
        int tcount  = (count * ncount + (count - 1)) * 3;
        m           = [[IETerrainPatchBuffer alloc] initWithCount:tcount
                                                       patchCount:patchCount];
        m.type      = PATCH_T;
        m.typeCount = NT;
        p1.y        = 0;
        p2.y        = 0;
        center.y    = half;
        start.y     = 0;
        for (x = 0; x < count; x++)
        {
            center.x = x * size + half;
            start.x  = x * size;
            for (n = 0; n < ncount; n++)
            {
                p1.x = start.x + (n * nsize);
                p2.x = p1.x + nsize;
                [m AddTriangle:center p2:p1 p3:p2];
            }
            if (x < max)
            {
                center2.x = (x + 1) * size + half;
                center2.y = center.y;
                [m AddTriangle:center p2:p2 p3:center2];
            }
        }
    }
    else if (NB >= count && count != patchCount)
    {
        ncount      = NB / count;
        nsize       = patchCount / NB;
        int tcount  = (count * ncount + (count - 1)) * 3;
        m           = [[IETerrainPatchBuffer alloc] initWithCount:tcount
                                                       patchCount:patchCount];
        m.type      = PATCH_B;
        m.typeCount = NB;
        p1.y        = patchCount;
        p2.y        = patchCount;
        center.y    = patchCount - half;
        start.y     = patchCount;
        for (x = 0; x < count; x++)
        {
            center.x = x * size + half;
            start.x  = x * size;
            for (n = 0; n < ncount; n++)
            {
                p1.x = start.x + (n * nsize);
                p2.x = p1.x + nsize;
                [m AddTriangle:center p2:p2 p3:p1];
            }
            if (x < max)
            {
                center2.x = (x + 1) * size + half;
                center2.y = center.y;
                [m AddTriangle:center p2:center2 p3:p2];
            }
        }
    }
    if (m)
    {
        m.count = count;
        [patches addObject:m], [m release];
    }
}
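
The buffer sizes passed to initWithCount above follow directly from the triangle counts, and they are handy to have as standalone functions when budgeting memory. The C sketch below just restates the expressions from ComputePatch; the function names are mine:

```c
/* Full patch: count x count quads, two triangles (6 indices) each. */
static int full_patch_indices(int count)
{
    return count * count * 6;
}

/* Internal partial patch: (count - 1) x (count - 1) quads between the
   quad centers, so a chunk at level 1 has none. */
static int partial_patch_indices(int count)
{
    int inner = count - 1;
    return inner * inner * 6;
}

/* Edge patch for a chunk at level count next to a neighbor at nLOD:
   each of the count edge quads fans out nLOD / count triangles from its
   center, plus count - 1 triangles joining adjacent centers. */
static int edge_patch_indices(int count, int nLOD)
{
    int ncount = nLOD / count;
    return (count * ncount + (count - 1)) * 3;
}
```

For example, a level 1 chunk next to a level 4 neighbor needs 12 indices (four triangles) for that edge, while a level 2 chunk next to a level 8 neighbor needs 27 (nine triangles).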

 

Terrain Nodes. The last important piece is a simple node class for each chunk of terrain. It does the following:

  1. Store the position of the chunk — its (x, z) grid position on the terrain
  2. Compute and store the static LOD for the chunk
  3. Store the heightfield values from the main raw map data and copy them to the main vertex array (in the main terrain class) each frame
  4. Manage Update and Render functions called from a quadtree in the main terrain class — a node is the leaf in the quadtree

Another class manages the branches for the quadtree and contains pointers to its leaf nodes. The main sauce in the node class, besides the static LOD calculation, is maintaining pointers to patches and rendering them. The following code does the updating. parent is the main terrain manager class, patch is the pre-computed patch pointers based on the LOD of the node. Zones are the neighboring nodes to the node, and parentZone is the branch to which the node belongs, which is important for updating the parent zone’s overall bounds for all the nodes it contains.
 

- (void) UpdatePatchLevels
{
    if ([self NoGenerateNeeded]) return;
    if (level == parent.maxLevel || [self NeighborsMatch])
    {
        patches[PATCH_F] = [patch GetBuffer:level type:PATCH_F];
    }
    else
    {
        for (int i = 4; i--;)
        {
            if (zones[i]->level > level)
               patches[i] = [patch GetBuffer:level type:i typeCount:zones[i]->level];
            else
               patches[i] = [patch GetBuffer:level type:i typeCount:level];
        }
        patches[PATCH_P] = [patch GetBuffer:level type:PATCH_P];
        patches[PATCH_F] = nil;
    }
    int mapRealSize     = parent.mapRealSize;
    GLushort* mapData   = parent.mapData;
    int x               = pos.x;
    int y               = pos.y;
    IE_VEC3* v          = &patch->verts[0];
    for (int i = 0; i < numHeights; i++, v++)
    {
        heights[i] = mapData[(int)(x + v->x) + (int)(y + v->z) * mapRealSize];
        if (heights[i] > max.y) max.y = heights[i];
        if (heights[i] < min.y) min.y = heights[i];
    }
    cen.y = (min.y + max.y) * 0.5;
    [parentZone UpdateBounds:self];
}
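
The edge selection in the loop above boils down to "match the neighbor only when it is finer". A stripped-down C sketch of that rule follows. The names are hypothetical, and use_full_patch is only my guess at the intent: it assumes a full patch is enough whenever no neighbor is finer, whereas the real NeighborsMatch may test for strict equality.

```c
/* Level to request for one edge: the neighbor's level if it is higher
   (finer) than ours, otherwise our own level. */
static int edge_level(int level, int neighborLevel)
{
    return neighborLevel > level ? neighborLevel : level;
}

/* Assumed rule: a single full patch is enough at maxLevel, or when none of
   the four neighbors is at a higher level than this chunk. */
static int use_full_patch(int level, int maxLevel, const int neighbors[4])
{
    if (level == maxLevel) return 1;
    for (int i = 0; i < 4; i++)
        if (neighbors[i] > level) return 0;
    return 1;
}
```

Because only the coarser chunk adapts, two neighbors never both try to fix the same shared edge.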

 

Once updated, a node simply renders its patches, as shown below. Note that the copy to heights is updating a single heightfield array (with the heightfield values for a node’s chunk, per frame) that, in this case, has already been enabled/sent to OpenGL as a single attribute:
 

glEnableVertexAttribArray(ATTRIB_HEIGHT);
glVertexAttribPointer(ATTRIB_HEIGHT, 1, GL_FLOAT, GL_FALSE, sizeof(float), &heights[0]);

 

Normally the “height” (the y value of the vertex) would go in as a member of a vertex struct. A separate attribute here is for convenience but appears to have no real impact on performance. Also note that in the first line of Render below, the shader is getting the node’s (x, z) position, which is added to each vertex position in the shader; again, no significant impact on performance.
 

- (void) Render:(IETerrainNode*)node
{
    [parent->shader SetVec2:IEU_NODE_POS vector:node.pos];
    memcpy(heights, node->heights, numVerts * sizeof(float));
    int numIndices = 0;
    if (node->patches[PATCH_F])
    {
        numIndices = node->patches[PATCH_F].numIndices;
        glDrawElements(GL_TRIANGLES, numIndices, GL_UNSIGNED_SHORT,
                           node->patches[PATCH_F].indices);
    }
    else
    {
        for (int i = 0; i < PATCH_F; i++)
        {
            if (node->patches[i])
            {
                memcpy(&renderIndices[numIndices], node->patches[i]->indices,
                             node->patches[i].numIndices * sizeof(GLushort));
                numIndices += node->patches[i].numIndices;
            }
        }
        if (numIndices)
        {
            glDrawElements(GL_TRIANGLES, numIndices,
                         GL_UNSIGNED_SHORT, renderIndices);
        }
    }
    numTriangles += numIndices / 3;
}
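
The non-full branch above is really just a gather: copy each patch's indices back to back into one scratch buffer so a node costs a single draw call. Here is the same idea in plain C with a bounds check added. PatchBuffer and gather_indices are hypothetical stand-ins, not the engine's types:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a patch's index buffer. */
typedef struct {
    const uint16_t *indices;
    int             numIndices;
} PatchBuffer;

/* Concatenate the indices of every non-NULL patch into out[], stopping
   before anything would overflow outCap. Returns the number of indices
   written, ready to hand to a single indexed draw call. */
static int gather_indices(const PatchBuffer *const *patches, int numPatches,
                          uint16_t *out, int outCap)
{
    int total = 0;
    for (int i = 0; i < numPatches; i++) {
        const PatchBuffer *p = patches[i];
        if (!p || p->numIndices <= 0) continue;
        if (total + p->numIndices > outCap) break;
        memcpy(out + total, p->indices,
               (size_t)p->numIndices * sizeof(uint16_t));
        total += p->numIndices;
    }
    return total;
}
```

The trade-off is one memcpy per patch each frame in exchange for fewer draw calls, which is usually the better deal on mobile GPUs.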

 

In practice both the iPhone 4 and 3GS can handle a reasonable number of Update and Render calls and still produce solid interactive frame rates (30 fps and higher). For 1024 nodes (chunks), for example, after frustum culling, we’re typically left with a few hundred visible nodes, averaging, say, four memcpy calls per patch (most patches are not full). Adding dynamic LOD can cut the size of the draw calls to some extent; however, the savings are not significant enough to justify all the extra mess that’s needed to manage popping/artifacts (I’ll try to cover this in a future post). Additional occlusion culling (horizon culling, for example) will have a greater positive impact on performance.