Saturday, July 22, 2017

Hey You, Pikachu!


"Hey You, Pikachu!" is a funny game in the 'virtual pet' genre. It can run in emulators, but you need the original N64 hardware that comes with the game: the Voice Recognition Unit (VRU) and a microphone. You also need an adapter to attach that device to a PC. You can't control your pet without it. However, you can load the game in an emulator without the VRU attached. When you run the game with GLideN64 in HLE mode, you will notice various graphics glitches:

olivieryuyu analyzed the microcode for this game and found that it is a custom microcode named F3DAM. It is a modification of the standard F3DEX2 microcode. Besides voice-recognition-specific code, it has modifications related to texture coordinate calculation and fog calculation. These differences cause the issues you can see in the screenshots above. olivieryuyu decoded these modifications, and I implemented them. All microcode-specific problems are gone:

If you don't have a VRU device but want to see how GLideN64 emulates this game, Daniel Eck made a nice two-hour gameplay stream on Twitch:

If you want to support my work:

Tuesday, July 4, 2017

"Star Wars - Rogue Squadron" crowdfunding campaign.


Today's top news is the start of the "Star Wars - Rogue Squadron" crowdfunding campaign on Indiegogo:


A few months ago olivieryuyu and I started to work on decoding this game's ucode and implementing it in HLE. We have spent a lot of time on it and got several good results. The task is really hard. I need your support and encouragement to complete it. This demo video shows the current state of the project:

We want it to run fast and look as good as in LLE mode or better.

I have a request for GLideN64 users: I don't have accounts on social networks. Please help me spread information about this campaign.

Update: The campaign reached its goal. Currently $625 USD has been raised by 25 backers. Thanks to all backers for the support! An alpha build of the project has been sent to all backers. Since the campaign is already successful, I'm continuing to work on the task. I hope the next alpha will show much more graphics.

Update 2: We have just finished implementing the microcode command that generates all terrain polygons in the game. An alpha build has been sent to backers. Demo video:


Friday, June 30, 2017

Acclaim custom lighting.

There are four N64 games that have the same issue in HLE mode: highlighting of some objects or areas is completely missing. An area should be highlighted as the result of an explosion or a shot from an energy weapon, but nothing happens:

The effect works OK in LLE

The games are: Armorines - Project S.W.A.R.M., South Park, Turok 2 - Seeds of Evil, and Turok 3 - Shadow of Oblivion. All these games were released by Acclaim Entertainment Inc. This is a suspicious coincidence. We found that all four games use the same ucode. The string id of the ucode claims that it is a standard modification of F3DEX2. Analysis of the lighting-related commands showed that the lighting method used by this ucode is not standard at all, but we did not find any documents that could explain how it works.

The only way forward in this case is reverse engineering of the ucode's assembly code. olivieryuyu, after his success with decoding the T3DUX ucode, decided to solve that mystery. He found that the lighting part of the ucode does indeed contain custom code. That custom code is activated only in special places in the games, exactly where the highlighting effect is missing.

Standard N64 lighting uses directional and ambient lights. A directional light has a direction (a vector with 8-bit coordinates) and a color. An ambient light has only a color. The vertex color is calculated as the sum of the colors of the directional lights, each multiplied by its light intensity, plus the color of the ambient light. Light intensity depends on the angle between the light direction and the surface normal, which is kept in the vertex.
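The standard calculation can be sketched in a few lines. This is only an illustration, not the code of any ucode or plugin: it uses floats in the 0..1 range instead of 8-bit values, assumes unit-length vectors, and assumes the usual convention that intensity is the cosine of the angle (the dot product) clamped at zero. The function and parameter names are mine.

```python
def standard_vertex_color(normal, ambient, directional_lights):
    """Illustrative directional + ambient lighting.

    normal: unit surface normal (x, y, z) stored in the vertex.
    ambient: (r, g, b) color of the ambient light, components in 0..1.
    directional_lights: list of (direction, color) pairs; direction is
        assumed to be a unit vector pointing towards the light.
    """
    r, g, b = ambient
    for direction, color in directional_lights:
        # Intensity is the cosine of the angle between normal and light
        # direction. Assumption: surfaces facing away get no light.
        cos_angle = sum(n * d for n, d in zip(normal, direction))
        intensity = max(0.0, cos_angle)
        r += color[0] * intensity
        g += color[1] * intensity
        b += color[2] * intensity
    # Keep the result in the displayable range.
    return tuple(min(1.0, c) for c in (r, g, b))
```

Note that the vertex position never enters the calculation: only the normal matters. This is the key difference from the custom method.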

The custom lighting method, which I called Acclaim lighting, works in a completely different way. The light structure contains the position of the light source in space (three 16-bit coordinates), three additional 16-bit parameters and the light's color: 16 bytes in total. The standard light structure has 12 bytes. Eight 16-byte light structures are loaded once at the beginning of the display list when the highlight effect is used. At first sight these structures have no relation to the further rendering process. Game objects use the same vertices, which have the same colors. The lighting bit in the geometry mode is switched off. The standard vertex processing method works as if no lighting is used, thus there is no highlighting effect.

olivieryuyu found the geometry mode bit that activates Acclaim lighting and decoded the calculations used by this method. How it works:
  • For each light source, calculate the vector from the light source position to the vertex.
  • Calculate the sum of the absolute values of the vector's x, y and z coordinates.
  • If this sum is greater than some parameter (say A) in the light source structure, this light is ignored.
  • Light intensity is calculated as abs(sum - A) * B, where B is another parameter in the light source structure.
  • The light color is multiplied by the light intensity and added to the vertex color.
  • The final result is clamped to 1.
Thus, vertex color brightness can be increased depending on vertex position. The algorithm looks like an approximation of point lighting. Standard point lighting uses the length of the vector from the vertex to the light source to calculate light intensity. Vector length is the square root of the sum of squares of the vector's coordinates. This method uses the plain sum of the absolute values of the vector's coordinates.
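The steps above can be sketched as follows. This is a minimal illustration of the listed calculations, not the GLideN64 code: it uses floats in the 0..1 range for colors, and the dictionary keys ('pos', 'a', 'b', 'color') are hypothetical names for the fields of the light structure.

```python
def acclaim_vertex_color(vertex_pos, vertex_color, lights):
    """Illustrative Acclaim lighting for one vertex.

    vertex_pos: (x, y, z) position of the vertex.
    vertex_color: (r, g, b) base color of the vertex, components in 0..1.
    lights: list of dicts with hypothetical keys:
        'pos'   - (x, y, z) position of the light source,
        'a'     - parameter A, the cut-off for the coordinate sum,
        'b'     - parameter B, the intensity multiplier,
        'color' - (r, g, b) color of the light.
    """
    r, g, b = vertex_color
    vx, vy, vz = vertex_pos
    for light in lights:
        lx, ly, lz = light["pos"]
        # Vector from the light source position to the vertex.
        dx, dy, dz = vx - lx, vy - ly, vz - lz
        # Sum of absolute coordinates instead of the true vector length.
        coord_sum = abs(dx) + abs(dy) + abs(dz)
        if coord_sum > light["a"]:
            continue  # The vertex is too far away: this light is ignored.
        intensity = abs(coord_sum - light["a"]) * light["b"]
        r += light["color"][0] * intensity
        g += light["color"][1] * intensity
        b += light["color"][2] * intensity
    # The final result is clamped to 1.
    return tuple(min(1.0, c) for c in (r, g, b))
```

For example, with a light at the origin, A = 100 and B = 0.01, a vertex at (10, -20, 30) has a coordinate sum of 60, so the intensity is 40 * 0.01 = 0.4. A vertex at (100, 100, 0) has a sum of 200 and is not lit at all.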

I implemented Acclaim lighting in GLideN64. The problem is finally solved.

Side by side comparison video


Friday, June 16, 2017

Toukon Road 1 & 2, Last Legion UX: HLE implementation


As you know, there are several games that do not work in HLE mode. Some games have major glitches, some do not work at all. These games use custom microcodes. We have no information about these microcodes and it is very unlikely that such information will ever appear. We can still run any game in LLE, but HLE is obviously faster. Thus, attempts to decode custom microcodes and improve the quality of HLE emulation continue. The only way to do it is to analyse the assembly code and try to understand what it does. It is a very hard task, which only a few people in the world can do (not me), so progress is slow.

olivieryuyu, the main beta tester of Glide64 and GLideN64, decided to take on the decoding task and has already achieved great results. Recently he decoded the microcode used by the Toukon Road 1 & 2 and Last Legion UX games. You can read the details on the wiki page:


Cite: "Last Legion UX, Shin Nihon Pro Wrestling Toukon Road - Brave Spirits and Shin Nihon Pro Wrestling Toukon Road 2 - The Next Generation uses a undocumented Nintendo ucode called T3DUX.
Shin Nihon Pro Wrestling Toukon Road - Brave Spirits uses the version 0.83 and the two other games 0.85.
It is an evolution of the turbo3d microcode which is used only by one game in its original format, Dark Rift.
The major change in T3DUX compared to turbo3d is what we can called a colors & texture coordinates state."

From my side, I wrote an HLE implementation of that ucode. Screenshots:

Last Legion UX ingame
Toukon Road intro
Toukon Road 2 intro


Sunday, April 2, 2017

Resident Evil 2

Resident Evil 2 for Nintendo 64 is a hard game to emulate. While the game uses a standard ucode (or a slight modification of a standard one), it uses a few non-standard tricks that are hard to reproduce on PC hardware. I spent lots of time on this game when I worked on the Glide64 plugin. The abilities of 3dfx graphics cards allowed me to obtain a pretty good result: the game was fully playable on Voodoo4/5 with some minor glitches. Later the necessary functionality was added to the glide wrapper, so you can run the game on any modern PC card.

What makes the game hard to emulate? As you know, the game consists of static 2D backgrounds with 3D models moving over them. Background size may vary from place to place: in some places it is 436x384, in others 448x328 and so on. The frame buffer size corresponds to the background size. The video interface stretches the image to the TV resolution of 640x480.

The first problem a hardware plugin faces in this game is the way the background is loaded into the frame buffer. To optimize background loading and rendering on N64 hardware, the background is loaded as an image with width 512. That is, a 448x328 image is loaded as 512x287; both layouts contain the same 146944 pixels. The game allocates a color buffer with width 512 and renders the background into it with the BgCopy command. In fact, BgCopy works like memcpy: it copies the background content from one address in RDRAM to another. When the buffer copy is completed, the game allocates a buffer with the same origin, but with width 448. Now the buffer has correct proportions, and 3D models can be rendered over it.

Why is it a problem for a hardware graphics plugin? The plugin executes the BgCopy command, which loads a 512x287 image. It is no problem to create a 512x287 texture and render it to the frame buffer. The result will look like this:

If the background is rendered right to the frame buffer, that result can't be fixed. If a frame buffer object is used for rendering, you may try to change the size of the buffer texture the same way as N64 changes the size of the color buffer. I did not find a way to change the size of an existing texture with OpenGL without losing its content. glTexImage2D can change the size/format of an existing texture object, but it removes all previous pixel data. Of course, it is possible to copy the texture data to conventional memory, resize the texture and write the data back, but it will be slow. If you know a better method, please share.

There is a fast solution to the problem: a hack. The value of the video interface register VI_WIDTH is the same as the actual width of the background image. Thus, we can recalculate the background image dimensions and load it properly:

I used that hack in Glide64 and I still don't know a better solution. Unfortunately, it works only for HLE, because BgCopy is a high-level command. For LLE we still need to resize the buffer texture somehow.
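The recalculation behind the hack is simple arithmetic: the number of pixels stays the same, only the line width changes. A minimal sketch follows; the function name and the fallback for sizes that do not divide evenly are my assumptions for illustration, not the plugin's actual code.

```python
def recalc_background_size(loaded_width, loaded_height, vi_width):
    """Return the (width, height) to use for the background image.

    loaded_width, loaded_height: size the game passes to BgCopy (e.g. 512x287).
    vi_width: value of the VI_WIDTH register, i.e. the real background width.
    """
    total_pixels = loaded_width * loaded_height
    # Assumption: if the pixel count does not fit whole lines of the new
    # width, keep the loaded size unchanged.
    if vi_width <= 0 or total_pixels % vi_width != 0:
        return loaded_width, loaded_height
    return vi_width, total_pixels // vi_width


# 512 * 287 = 146944 = 448 * 328
print(recalc_background_size(512, 287, 448))
```

With the sizes mentioned above, a 512x287 load with VI_WIDTH equal to 448 gives back the original 448x328 background.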

The next problem is depth compare. I already described the problem here and here, so I cite myself:
"Few games use scenes consisting of 3D models moving over 2D background. Some of objects on the background can be visually "closer" to user than 3D model, that is part of the 3D model is "behind" that object and that part must not be drawn. For fully 3D scene problem "object behind other object" is usually solved by depth buffer. 2D background has no depth, and depth buffer by itself can't help. Zelda OOT solves that problem by rendering auxiliary 3D scene with plain shaded polygonal objects, corresponding to the objects on the background. Thus, the scene gets correct depth buffer. Then the background covers this scene and 3D models rendered over the background are cut by depth buffer when the models are behind the original polygonal objects.
In Resident Evil 2 all screens are 3D models over 2D backgrounds. But the game does not render auxiliary 3D geometry to make depth buffer. Instead, the game ROM contains pre-rendered depth buffer data for each background. That depth buffer data is copied into RDRAM and each frame it is rendered as 16bit texture into a color buffer which then is used as the depth buffer. To emulate it on PC hardware the depth buffer data must be converted into format of PC depth buffer and copied into PC card depth buffer."

Glide64 was the first plugin where the problem was solved. Copying values to the depth buffer was relatively easy with the glide3x API: the glide3x depth buffer format is 16-bit integer, as on N64. I could load the depth image as a 16-bit RGB texture, render it to a texture buffer and then use that buffer as the depth buffer, exactly as N64 does. OpenGL could not do it, but the glide wrapper authors also managed to solve that problem. It was kinda hackish, but it works.

GLideN64 uses another solution. I invented it for the NFL Quarterback Club 98 TV monitor effect. It is described in detail in my Depth buffer emulation II article. The depth image is loaded as a texture with one component, RED, and a texel format of GL_UNSIGNED_SHORT. When the texture is rendered, the fragment shader stores the fetched texel as its depth value. The depth value from the fragment shader is passed to the depth buffer, exactly as we need.

So, we have color background and depth buffer correctly rendered. Victory? Not yet. Depth buffer compare works, but not always. Here it works ok:

but if I step behind it looks like this:

Where is the problem? The problem is in the way the N64 depth buffer works. An N64 vertex uses an 18-bit fixed point depth value. The N64 depth buffer stores 16-bit elements. N64 uses a non-linear transformation of the 18-bit vertex depth value to a 16-bit value, which is used for depth compare and then kept in the depth buffer. OpenGL uses floats for vertex depth and for the depth buffer, but it is incorrect to directly compare the GL depth component with a value from the N64 depth image. First, the same transformation must be applied to the vertex depth. Fortunately, the necessary shader code was already written for depth-based fog, which Beetle Adventure Racing uses. I reused that code and finally got a perfect result:


Saturday, April 1, 2017

Major modification of frame buffer and video interface emulation.

I already wrote about N64 Video Interface emulation in GLideN64. It was my first attempt to make things right. Three years have passed. Many elements of the frame buffer emulation mechanism have been modified since that time. However, one major problem remained. This problem is as old as N64 emulation itself. This is the "frame buffer height" problem.

To render anything you first need to allocate a rectangular buffer that will hold your graphics. You need to know the width and height to allocate the buffer. The problem is that the RDP command SetColorImage sets only the width of the color buffer. The height is not set. The RDP does not need to know the buffer height. SetColorImage provides the buffer origin, the number of pixels per line and the size of each pixel in bytes. This is enough to calculate the position of a vertex with given X and Y coordinates within the buffer. The Scissor command prevents out-of-buffer writes. A software graphics plugin works exactly as the RDP does and also does not need to know the buffer height. A hardware plugin is in trouble. Suppose we selected 960x720 resolution with 4:3 aspect ratio and created a 960x720 render buffer in video memory. The N64 game allocates a buffer with width 320. Which scale should we apply to the original N64 coordinates to get a correct picture in our render buffer? Since 960 = 3 x 320, it seems that the correct scale is 3x. That is, we scale the original N64 X and Y coordinates by 3 and get a picture in our buffer. Will this picture be correct? Only if the original buffer also has 4:3 aspect, that is, has size 320x240. In reality, it can also be 320x220, 320x256 or even 320x480. In all these cases a 3x scale for Y will give us a wrong result. To get the correct Y scale we need to know the height of the original buffer, but it is not available.
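The arithmetic of this example can be written down as a tiny sketch. The function name is hypothetical; it only computes the scales needed to fill a render buffer of a fixed size, which is exactly the value a hardware plugin cannot know without the N64 buffer height.

```python
def required_scales(render_width, render_height, n64_width, n64_height):
    """Return (x_scale, y_scale) that map an N64 buffer of the given size
    onto a render buffer of fixed size render_width x render_height."""
    return render_width / n64_width, render_height / n64_height


# The X scale is always known, but the Y scale depends on the N64 height.
for n64_height in (240, 220, 256, 480):
    x_scale, y_scale = required_scales(960, 720, 320, n64_height)
    print(f"320x{n64_height}: x scale {x_scale}, y scale {y_scale:.3f}")
```

Only the 320x240 case gives the same 3x factor for both axes. For 320x480 the Y scale has to be 1.5, so blindly reusing 3x for Y would stretch the picture to twice the height of the render buffer.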

The height of an N64 render buffer can be estimated from the parameters of the Video Interface, which define how the color buffer will be mapped to the TV screen. All hardware plugins that I know from the inside use this possibility. Thus, frame buffer allocation becomes dependent on VI registers. This dependency does not exist in N64 itself. The height estimation is not guaranteed to be always correct, and in fact it is often incorrect. The estimation code is complex and full of heuristics to reduce the number of errors. Nevertheless, this tie still induces many issues, in particular with PAL games and with games that use interlaced TV modes.

Besides main color buffers, whose content is displayed on TV, N64 games often use auxiliary color buffers. These buffers are used for a variety of purposes: dynamic shadows, reflections, TV monitors and so on. An auxiliary color buffer can be of any size. Thus, estimation of auxiliary buffer height is a complex and fully heuristic algorithm, which also does not always work right. A wrong height leads to visual glitches.

At the end of 2016 I finally invented a way to get rid of the need to know the exact height of N64 color buffers. The idea is actually very simple. Why does the RDP not care about buffer height? It knows that the height is large enough and just fills the buffer with primitives. The Video Interface takes the necessary part of the buffer and maps it onto the TV screen. Auxiliary buffers are used as textures: the game's program code knows the buffer's bounds and maps texture coordinates to its content.
My frame buffer mechanism creates a separate frame buffer object in video memory for each buffer allocated by the RDP. I used the estimated height to create the buffer render target. It caused the aforementioned issues when the estimation heuristics failed and produced a wrong result. So, the idea is to not use the estimated buffer height and always use a large enough height instead. 'Large enough' should be taken literally. It is some value that is surely greater than or equal to any possible height of an N64 buffer. There are some natural limitations: the maximal buffer size is 640x480 for NTSC and 640x576 for PAL.
Since I know the width of the rendering resolution selected by the user and I know the width of the N64 rendering buffer, I know how to scale the original coordinates of N64 vertices. This scale can be applied to both X and Y coordinates, no matter whether the N64 buffer has the same aspect as the user-selected screen resolution or not. Video Interface emulation will map my frame buffer object to the screen the same way as the N64 Video Interface maps the N64 buffer in RDRAM to the TV screen.
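Here is a minimal sketch of that idea, under stated assumptions. The names are hypothetical, and the text above does not say how the 'large enough' height is chosen for buffers narrower than 640 pixels, so the sketch simply takes it as a parameter; only the 480 (NTSC) and 576 (PAL) limits come from the description above.

```python
import math

NTSC_MAX_HEIGHT = 480  # natural limit for a 640-wide NTSC buffer
PAL_MAX_HEIGHT = 576   # natural limit for a 640-wide PAL buffer


def frame_buffer_layout(n64_width, render_width, large_enough_height):
    """Return (scale, fbo_width, fbo_height) for a frame buffer object.

    The single scale is derived from widths only, so the real N64 buffer
    height is never needed. large_enough_height is in N64 pixels and is
    supplied by the caller (assumption: how it is picked is not shown here).
    """
    scale = render_width / n64_width
    fbo_height = math.ceil(large_enough_height * scale)
    return scale, render_width, fbo_height


def to_render_coords(x, y, scale):
    """Apply the same scale to both N64 coordinates."""
    return x * scale, y * scale


scale, width, height = frame_buffer_layout(640, 1280, NTSC_MAX_HEIGHT)
print(scale, width, height)
print(to_render_coords(100, 50, scale))
```

The price of this simplicity is memory: the object is allocated for the largest possible height even when the game fills only a part of it.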


  • No more buffer height estimation heuristics.
  • No more glitches caused by wrong height estimation.
  • Emulation of effects that did not work before.
  • More video memory is needed. The memory overhead is not large for main buffers, because the actual buffer height is usually close to the natural limit used as the Large Enough Height. Memory allocated for auxiliary buffers can be 10 times more than actually used.

While the idea is simple, its implementation was not. It was obvious that lots of things needed to be changed. The first step was code refactoring, mentioned in the previous article. After that step I got clearer and easier to modify code. It was not enough though. Some preliminary steps had to be done first.

There is one OpenGL-specific problem with emulation of N64 graphics. N64 uses a coordinate system with the origin in the upper left corner. The Glide3X API allowed setting the origin to either upper left or lower left. So, when I worked on Glide64, I set the origin to upper left and had no inconveniences. OpenGL has the origin nailed to the lower left corner. If you use N64 coordinates, you will get the image upside down. Thus, the Y coordinate must be inverted. The (0,0) coordinate is translated to (0, maxY), where maxY is the buffer's height.
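The inversion itself is a one-liner; the inconvenience is that rectangles need it too, and there the height of the rectangle also enters the formula. A small sketch with hypothetical helper names:

```python
def invert_y(y, buffer_height):
    """Map a Y coordinate with upper-left origin to lower-left origin."""
    return buffer_height - y


def invert_rect_y(top, rect_height, buffer_height):
    """Return the lower-left-origin Y of a rectangle given by its top edge
    (upper-left origin) and height. OpenGL expects the bottom edge."""
    return buffer_height - (top + rect_height)


print(invert_y(0, 240))            # top line of the N64 buffer
print(invert_rect_y(10, 100, 240)) # bottom edge of a 100-pixel-high rectangle
```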

It is a simple trick, but you need to apply it everywhere: modify vertex Y, viewport Y, scissor Y. Reads from the frame buffer to RDRAM have to be done in reverse order. Things could get even more complicated with the new frame buffer technique. Thus, I decided to remove Y inversion. Of course, the image will be upside down in that case.

However, the image is in the frame buffer texture, which I can map to the screen as I need. So, it is not a problem. The problem arises when you do not use a frame buffer object and render right to the back buffer. GLideN64 renders right to the screen when frame buffer emulation is disabled. I did not want to keep the Y inversion code to support "no frame buffer emulation" mode. My goal was to simplify things, not to make them more complex and intricate. Thus, I decided to slightly modify "no frame buffer emulation" mode: use one frame buffer object for rendering instead of direct rendering to the back buffer. It is also mentioned in the previous article: "Anti aliasing without frame buffer emulation". After that modification I could safely remove the Y inversion code.

After the preliminary work was completed, the real challenge started. Implementation of my idea was a very hard task. Frame buffer emulation was tightly entangled with VI emulation, and I spent a lot of time untangling multiple knots and fixing the weirdest glitches. In the end I was fully rewarded. Issues with cut image in PAL games are gone. Issues with screen shakes in interlaced mode are gone. Many crashes with buffer copy to RDRAM are gone. VI effects started to work more smoothly. The screen shrink VI effect in Mia Hamm Soccer finally started to work properly.

Saturday, March 25, 2017

Project news


Three months have passed since the latest Public Release. Time to report on the most noticeable changes.

Massive code refactoring

GLideN64 currently supports the following graphics APIs: OpenGL, GLES2, GLES3, GLES3.1.
OpenGL support is also divided into GL 3.3 and GL 4.3+. API functions were called directly from any place in the code. This caused the following problems:

  • The code contained lots of GL version-specific code, separated by #ifdef (or if() for OpenGL versions).
  • The Android emulator had to distribute 4 GLideN64 binaries for each supported CPU family.

I refactored the GLideN64 code to totally remove direct calls to the graphics API from the main code. All core GLideN64 classes use a special proxy class, graphics::Context, to manipulate textures, shaders, buffers and so on, and to draw objects. Context passes calls to a back-end class. Currently there is one back-end, which uses OpenGL. If somebody wants to add Vulkan API or DirectX API support, it can be done much more easily now: just write a new back-end.

The OpenGL back-end is designed to adapt dynamically to the available GL version. It may use different functions for the same task. For example, if the available GL supports glTexStorage2D, a new texture will be initialized with glTexStorage2D, and with glTexImage2D otherwise.
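The structure can be illustrated with a small sketch. It is not the GLideN64 code (which is C++) and makes no real GL calls: the two init functions are stand-ins that only report which GL function would be used, and every name except graphics::Context, glTexStorage2D and glTexImage2D is hypothetical.

```python
def init_with_tex_storage(width, height):
    # Stand-in for a path that would call glTexStorage2D.
    return f"glTexStorage2D {width}x{height}"


def init_with_tex_image(width, height):
    # Stand-in for the fallback path that would call glTexImage2D.
    return f"glTexImage2D {width}x{height}"


class OpenGLBackend:
    """Picks an implementation once, based on what the driver supports."""

    def __init__(self, supported_functions):
        if "glTexStorage2D" in supported_functions:
            self._init_texture = init_with_tex_storage
        else:
            self._init_texture = init_with_tex_image

    def init_texture(self, width, height):
        return self._init_texture(width, height)


class Context:
    """Proxy used by the core code; it never touches the graphics API."""

    def __init__(self, backend):
        self._backend = backend

    def init_texture(self, width, height):
        return self._backend.init_texture(width, height)


modern = Context(OpenGLBackend({"glTexStorage2D", "glTexImage2D"}))
legacy = Context(OpenGLBackend({"glTexImage2D"}))
print(modern.init_texture(512, 256))
print(legacy.init_texture(512, 256))
```

The core code calls Context the same way in both cases. The choice is made once, when the back-end is created, instead of being repeated in #ifdef blocks all over the code.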

Another example is polygon drawing. Core OpenGL 3.3 requires vertex data to be passed from the application to GL via a Vertex Buffer Object (VBO). GLideN64 used immediate mode rendering with data stored in client-side arrays. Thus, it could not use the core profile. The new back-end implements a primitives drawer that uses VBO and supports the core profile. However, we found that many Android devices work better with the old immediate mode rendering. So, the back-end also has a primitives drawer that uses immediate mode. The back-end decides which drawer to use at run time and does it transparently to the main code.

The amount of code changes was huge. I totally rewrote many parts of the code. As a result, the code is much cleaner now. Logan McNaughton and Francisco Zurita helped me to tune the back-end and select the most effective GL functions for each GL version. In most cases the refactored code works as fast as or faster than before refactoring. The Android port now uses only one binary for all versions of GL ES.

VSync support

The GLideN64 version for Zilmar-spec emulators did not support vertical sync. I thought that the need for that option was gone together with analog monitors. However, users asked me to add it, because they experienced tearing on their monitors without it. VSync is not part of the OpenGL specification. WGL extensions are required to enable it on Windows. I had no time for it until recently. The code refactoring made it possible to use the OpenGL core profile on Windows. The core profile also requires the use of WGL extensions. I made the necessary changes. Adding VSync support was a matter of a few lines after that. Ryan Rosser added a new control to the GUI.

Mac OS X port

While GLideN64 works successfully on Linux and Android, a Mac port was impossible until now. The Mac OpenGL driver requires an application to use the core GL profile if it needs to work with OpenGL 3.3 or above. That implies VBO support. GLideN64 did not support VBO until the refactoring. I made a new attempt to port the code to Mac OS X after the refactoring was completed. This attempt was successful:

I don't know how well the port works: I have no Mac. I got remote access to a Mac mini via command line and just made the code compilable. The video was provided by Brent Woodruff, who built the plugin from sources and ran it on his Mac.

Anti aliasing without frame buffer emulation

Frame buffer emulation is enabled by default. It can be disabled, but that also disables many features, including anti-aliasing and gamma correction. This is because anti-aliasing and gamma correction require rendering with Frame Buffer Objects (FBO), which was enabled only with frame buffer emulation. I changed it: now the plugin always uses FBO for rendering. This made it possible to use anti-aliasing and gamma correction even when frame buffer emulation is disabled. This was done as a preliminary step for another large code refactoring, which I will describe next time.


Donations are welcome. Two options are available: Yandex Money (the form above) and PayPal: https://www.paypal.me/SergeyLipskiy. Both methods work well, my thanks to people, who used them. Also, does anybody know how to place my paypal.me link as widget/gadget on blog layout? I'm helpless in web design. Another side note: it seems that my mail server has problems with sending mails to AOL mailboxes. I tried to say thanks by email, but it probably was not delivered.