I have a homebuilt computer with an Antec Sonata III case and an Intel DX58SO motherboard. I really, really like both the case and the motherboard, but I’ve been struggling with a sound problem in the computer for two years now. It never bothered me enough to think about fixing it until last week, but I finally fixed it tonight and would like to share my story.

The problem goes as follows:

I use the front panel headphone jack a lot on my computer. The sound coming out of that jack was always fine until I plugged a high-speed USB device into one of the two front panel USB ports. As soon as I did that, I would always hear weird “beeps” or “screeches” in my headphones as USB data was being transferred. The problem was especially noticeable with things like external USB flash drives, my iPhone, and my AVR ISP programmer, but not with lower-speed devices like USB gamepads. It only affected the combination of the front headphones and the front USB ports — devices plugged into the rear USB ports would not affect the headphones, and devices plugged into the front USB ports did not affect the rear speaker jack.

In some ways, it was actually kind of cool to be able to hear USB data transfers, but it would get annoying when I was trying to listen to music. I did a lot of Googling about this problem, and it turns out that it’s a common problem with some of Antec’s cases (and cases by other manufacturers too). The connector you plug into your motherboard for the front USB ports has a ground pin on it. The connector you plug into your motherboard for the front audio ports also has a ground pin on it. So your motherboard provides a ground for each of these connectors. The problem is that the front port assembly on my case connected the ground pins for the USB ports and the audio ports together on its end. This created a ground loop. My understanding is that because the USB and audio connectors provide their own separate grounds through different wires, they should not be connected together at the front panel because if the two grounds differ slightly in voltage (which they can, because wires and PCB traces do have a [small] resistance), current will flow between them. This current flow manifests itself as weird sounds on my headphone jack. (I’m really not an electrical engineering expert, so if I’m incorrect or lacking in my explanation, someone please let me know in the comments below).

Here are some forum/blog postings by other people who encountered this same problem on other case models:

I was able to verify this by using my multimeter in “continuity test” mode. After disconnecting all the case’s wires to the motherboard, I tested the continuity between the high definition audio (HDA) connector’s ground pin and the USB connector’s ground pin. It showed continuity, so that proved that the USB and audio connectors’ grounds were hooked together somewhere inside the front panel module.

Before I did any of these diagnostics, I was writing back and forth with Antec’s customer support. I must say, Antec has excellent customer service. The people I’ve dealt with are friendly and knowledgeable. Antec customer support told me that they have this problem fixed in a newer version of the front port assembly, and they sent me one. While I was waiting for it to arrive, I decided to play with my existing front port module to see if I could fix it myself, knowing that even if I failed, a new one was on the way. With nothing to lose, why not go for it?

First, I took the air filter out (it slides out of the bottom of the front of the case). Next, I removed the front of the case. This was slightly tricky — I removed all 5.25″ devices (just my DVD burner in this case) and the external 3.5″ drive bay. With those out of the way I was able to pull off the front of the case by releasing the 6 tabs holding it on (3 on each side of the case — top, middle, and bottom). The front panel module also had a wire screwed to the chassis which I had to disconnect (interestingly, this wire is only attached to the front panel eSATA port — it has no connection whatsoever to the audio or USB ports — good thing because that would probably cause another ground loop!)

Here’s the front of the case, completely removed:

The front port assembly was attached to the front of the case by 3 screws — I removed these and then it was free to go:

Unfortunately, the front port assembly is sealed–it’s glued together or something along those lines. It’s not held together with screws or anything like that, so it’s kind of a pain to open up. It was no match for my X-Acto knife though! I was able to cut around where I could see the two plastic halves meet, and with the help of a small screwdriver, pried it apart. Amazingly, the two halves of the plastic casing stayed intact! The culprit immediately jumped out at me: there was a white wire coming out of the sealed audio assembly and soldered to the shield of one of the USB ports. I checked it with my continuity tester to be sure, but I already knew it had to be that wire, because it was the only wire that connected the USB ports to the audio ports. (This picture was taken after I had hacked it, but I tried to recreate what it looked like at first)

I grabbed my soldering iron and desoldered the white wire from the USB port:

Before doing anything else, I plugged the USB ports and audio jacks back into the motherboard, booted it up, and tested to see if the interference was still there. It was GONE! In fact, I was able to make the interference reappear by momentarily touching the white wire back to the USB port. My multimeter confirmed that disconnecting the white wire had indeed separated the audio and USB grounds from each other. I ended up cutting off most of the white wire and covering the remaining stub with electrical tape just to be safe. Somehow, even after my hack job with the X-Acto knife, the two halves of the plastic case for the front port assembly snapped back together (and stayed that way), so I didn’t have to reseal it. I put the case back together, and the audio output from my headphone jack is perfect now. I can’t hear any interference at all anymore, even if I turn my sound up all the way and plug in both my AVR programmer and my iPhone to the front USB ports.

I wish I had fixed that one a long time ago!

Epilogue:

The replacement front port assembly from Antec arrived, and it also fixes the sound problem. If you don’t want to hack apart your front assembly, get ahold of Antec customer support. They are great! I would definitely recommend getting a replacement assembly from Antec instead of hacking apart your existing one. For anyone interested, I checked out the new assembly with my multimeter, and it correctly has the USB and audio grounds separated now. The new revision also makes another change I thought I’d mention: the USB ports are also grounded to the chassis now (only the eSATA port was before — now both USB ports and the eSATA port are). So if you insist on doing it yourself, it might be wise to take that white wire you cut and use it to connect the USB and eSATA grounds together, so they will all be grounded to the chassis.

The easiest way would probably be to leave the white wire soldered to the USB port, cut it off on the audio port end, shorten the wire, and solder the other end to where the chassis ground wire connects to the little eSATA PCB. Here’s an illustration of what I mean:

Hi again everybody! Once again, I let a bunch of time elapse before writing another article in my microcontroller programming series. I left off last time by mentioning that most microcontrollers have a built-in SPI peripheral that handles the SPI communication protocol for you. Today, I’m going to talk about how such a peripheral works. I’m going to do it differently from how I have done it in the past, though. This time, instead of making up a theoretical peripheral, I’m going to walk through an actual peripheral built into a real microcontroller! I’m going to be using the AVR ATmega328P, which is also the chip used in the Arduino Uno. Note: I don’t actually have an Arduino Uno (nor anything else with an ATmega328P for that matter), but I figured I would review the peripheral and talk about how to make it work.

We’re going to pretend to talk to some random SPI slave device, and I’ll show you how to write a byte to it and read a byte back from it. In this situation, the ATmega328P will be the master device.

As you may (or may not) remember, SPI uses four pins: MISO, MOSI, CLK, and CS. Before we do anything, we need to initialize these four pins as inputs or outputs. Let’s take a second to figure out how the pins need to be configured. MISO is master in, slave out. This means it will be an input on the master, since data is coming from the slave to the master. MOSI is the opposite, so it will be an output. CLK and CS are both controlled by the master, so they will also be outputs. So the first line of business is to set MISO as an input and the other three pins as outputs.

On the AVR we’re using, port B is where the SPI functions are located. On port B, pin 2 is chip select (they call it slave select, but it means the same thing), pin 3 is MOSI, pin 4 is MISO, and pin 5 is CLK. So using the AVR’s data direction register, let’s make sure that port B, pin 4 is configured as an input, while port B, pins 2, 3, and 5 are outputs.

DDRB |= ((1 << 2) | (1 << 3) | (1 << 5));
DDRB &= ~(1 << 4);

This will turn on bits 2, 3, and 5 of port B’s data direction register (DDRB), and turn off bit 4.
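If you want to convince yourself what those two lines do without an AVR on hand, here is a minimal sketch that runs on a PC. A plain variable stands in for DDRB, and the function name and pin constants are made up for illustration (they are not from avr-libc).

```c
#include <stdint.h>

/* Pin numbers on port B, as listed in the article. */
enum { PIN_SS = 2, PIN_MOSI = 3, PIN_MISO = 4, PIN_SCK = 5 };

/* Given the old data direction value, return the value an SPI master needs.
 * The argument is a stand-in for DDRB so this can run on a PC. */
uint8_t spi_master_ddr(uint8_t ddr)
{
    /* SS, MOSI, and SCK become outputs (bits set to 1). */
    ddr |= (uint8_t)((1u << PIN_SS) | (1u << PIN_MOSI) | (1u << PIN_SCK));
    /* MISO becomes an input (bit cleared to 0). */
    ddr &= (uint8_t)~(1u << PIN_MISO);
    return ddr;
}
```

Whatever the register held before, bits 2, 3, and 5 end up set and bit 4 ends up clear, while the other four bits are left alone.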

Now, before we do any more register reading/writing, let’s look at the SPI registers available in the ATmega328P’s datasheet. This is more or less going to be the same information available in the datasheet, but I’d like to walk through it to describe which registers are important and which ones are just small configuration things that aren’t really relevant to understanding how the peripheral works.

The first register is SPCR, the SPI control register.

  • Bit 7 is SPIE — the SPI interrupt enable. If this bit is turned on, you will get an interrupt from the SPI peripheral whenever a transmission completes. For now, we don’t want to worry about interrupts, so we will leave this off.
  • Bit 6 is SPE — SPI enable. This bit actually turns on the SPI peripheral, so we will definitely want to turn it on.
  • Bit 5 is DORD — data order. When it’s 1, data is transmitted serially least-significant bit first. 0 means most-significant bit first. What you set here depends on the slave you will be talking to. For our purposes, let’s assume the slave we will be communicating with expects to receive data least-significant bit first, so we will set it to 1.
  • Bit 4 is MSTR — master/slave select. We will be setting this to a 1 to ensure that the AVR will act as a master rather than a slave.
  • Bit 3 is CPOL — clock polarity. This is another one that’s dependent on the slave you’re talking to. Some slaves expect you to keep the clock line high when you’re not communicating to the slave, and others expect you to leave it low. The slave datasheet will tell you which one you’re supposed to use. For our purposes, let’s assume we are supposed to set CPOL to 1.
  • Bit 2 is CPHA — clock phase. This one goes hand-in-hand with CPOL, and is another one that you will have to determine by looking at the slave’s datasheet. It has to do with when data is sampled. We will assume CPHA is supposed to be 0.
    • Quick note: CPOL and CPHA are sometimes treated together as a 2-bit value called the SPI mode. In our case, with CPOL 1 and CPHA 0, we are using SPI mode 2 (binary 10). Some SPI slave datasheets will specify a mode number rather than CPOL or CPHA, or might only show a signal diagram in which you will have to figure out for yourself what CPOL and CPHA are. See the datasheet for more info on this, but it’s not really important for understanding how to use the SPI peripheral.
  • Bits 1 and 0 (SPR1 and SPR0) together determine the SPI clock rate divider. They will allow you to divide the CPU clock by 4, 16, 64, or 128 to determine an SPI clock rate (how fast the CLK pin will toggle from low to high and low again while you are sending data to the slave). The datasheet also mentions that if you set the SPI2X bit of the SPI status register, you can divide the clock by 2, 8, 32, or 64 instead. For our purposes, let’s assume the CPU clock rate is 8 MHz and our SPI slave’s maximum clock rate is 500 kHz. Thus, we can set the divider to 16 (so the divider bits SPR1 and SPR0 will be 01 and SPI2X will be 0), which will give us an exact SPI clock rate of 500 kHz (0.5 MHz).
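The divider choice can also be worked out in code. The following is only a sketch that runs on a PC: the struct and function names are hypothetical, and the table simply restates the divider values listed above (divide-by-64 has two encodings; the one with SPI2X off is used here).

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

struct spi_clock_cfg {
    unsigned divider;  /* CPU clock is divided by this value */
    bool spr1, spr0;   /* SPCR bits 1 and 0 */
    bool spi2x;        /* SPSR bit 0 */
};

/* Pick the fastest SPI clock that does not exceed max_hz.
 * Returns false if even divide-by-128 would be too fast for the slave. */
bool spi_pick_clock(uint32_t f_cpu_hz, uint32_t max_hz, struct spi_clock_cfg *out)
{
    /* Ordered from fastest to slowest. */
    static const struct spi_clock_cfg table[] = {
        {   2, false, false, true  },
        {   4, false, false, false },
        {   8, false, true,  true  },
        {  16, false, true,  false },
        {  32, true,  false, true  },
        {  64, true,  false, false },
        { 128, true,  true,  false },
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        /* f_cpu / divider <= max_hz, written without division so that
         * rounding cannot let a slightly-too-fast clock through. */
        if ((uint64_t)max_hz * table[i].divider >= f_cpu_hz) {
            *out = table[i];
            return true;
        }
    }
    return false;
}
```

With an 8 MHz CPU clock and a 500 kHz limit, this picks the divide-by-16 entry (SPR1 = 0, SPR0 = 1, SPI2X = 0), matching the choice made by hand.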

I’ll also go through the SPI status register (SPSR) really quickly since I have already mentioned it. Usually, status registers are read-only, and they exist to tell you the status of the peripheral. In this case, the SPI2X bit is a special exception.

  • Bit 7 (SPIF) is the SPI interrupt flag, which is just a flag that gets set to 1 whenever a transfer has completed. Even though we’re not using interrupts in this simple example, we will still look at this bit to determine when an SPI transfer is complete.
  • Bit 6 (WCOL) is the write collision flag, which lets you know if you wrote to the data register while a transfer was still in progress (you’re not supposed to do that). I consider this a fairly useless bit because you should ensure your code doesn’t try to write to the data register while a transfer is already in progress.
  • Bit 0 is the SPI2X bit that I mentioned earlier — it changes the clock rate as I mentioned.

There is one more register: the SPI data register (SPDR). This is the register you write to in order to begin an SPI transmission. If you want to send 0x52 over SPI, you would write 0x52 to this register, and then wait for the transmission to complete. Then, you can read from this same SPI data register to see the eight bits that the slave sent back to you while you were sending the 0x52 to it.
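Why does reading the same register give you the slave's byte? One way to picture it is as two shift registers wired in a ring. Here is a small software model of that idea, assuming least-significant-bit-first order as chosen earlier. It is an illustration only (the function name is made up), not a description of the AVR's actual internals.

```c
#include <stdint.h>

/* Software model of one 8-bit SPI transfer, least-significant bit first.
 * On every clock the master shifts one bit out on MOSI while the slave
 * shifts one bit out on MISO, so after 8 clocks the bytes have swapped. */
void spi_model_exchange(uint8_t *master_reg, uint8_t *slave_reg)
{
    for (int clock = 0; clock < 8; clock++) {
        uint8_t mosi = (uint8_t)(*master_reg & 1u);  /* bit leaving the master */
        uint8_t miso = (uint8_t)(*slave_reg & 1u);   /* bit leaving the slave */
        *master_reg = (uint8_t)((*master_reg >> 1) | (miso << 7));
        *slave_reg  = (uint8_t)((*slave_reg >> 1) | (mosi << 7));
    }
}
```

If the master starts with 0x52 and the slave starts with some reply byte, then after the eight clocks the master's register holds the reply and the slave's register holds 0x52.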

That’s it! That’s all there is to the SPI peripheral in the AVR. You’ll notice that it doesn’t provide any options for 16- or 32-bit transmissions, but you can do it yourself by sending 2 or 4 successive 8-bit transmissions. Also, there’s one other thing I should point out. The datasheet says that the SPI interface does not automatically control the SS (chip select) line. So you need to handle it yourself before starting a transmission and after a transmission is complete. Let’s assume that the chip select line should be high when idle, and low when you’re talking to the chip. OK, so let’s write some code!

We have already initialized the port direction registers, so now let’s turn on the SPI peripheral:

SPCR = (1 << SPE) | (1 << DORD) | (1 << MSTR) | (1 << CPOL) | (1 << SPR0);

I went ahead and left out the bits I set to zero, but you could insert them as (0 << BITNAME) if you want it to be completely clear.
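For reference, here is what the fully spelled-out version might look like, written so it can be compiled on a PC. The bit positions come straight from the register description earlier in the article; the `_BIT` names and the function are made up here so they do not collide with the names in the AVR headers.

```c
#include <stdint.h>

/* Bit positions within SPCR, as described in the register walkthrough. */
enum {
    SPIE_BIT = 7, SPE_BIT = 6, DORD_BIT = 5, MSTR_BIT = 4,
    CPOL_BIT = 3, CPHA_BIT = 2, SPR1_BIT = 1, SPR0_BIT = 0
};

/* Build the SPCR value used in this article, with every bit listed. */
uint8_t spcr_value(void)
{
    return (uint8_t)((0u << SPIE_BIT)   /* no interrupts */
                   | (1u << SPE_BIT)    /* SPI enabled */
                   | (1u << DORD_BIT)   /* least-significant bit first */
                   | (1u << MSTR_BIT)   /* master mode */
                   | (1u << CPOL_BIT)   /* clock idles high */
                   | (0u << CPHA_BIT)   /* clock phase 0 */
                   | (0u << SPR1_BIT)   /* divider bits 01 ... */
                   | (1u << SPR0_BIT)); /* ... which means divide by 16 */
}
```

The result is 0x79, which is a handy number to look for in a debugger if you want to confirm the register was written as intended.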

We should probably also ensure that the SPI2X bit in the status register is off:

SPSR &= ~(1 << SPI2X);

OK, so now the SPI peripheral is pretty much ready for action! Let’s do one last part of preparation and ensure that the chip select pin is high, which it should be while idle:

PORTB |= (1 << 2);

We don’t have to worry about any of the other pins, because the AVR’s SPI peripheral has taken control of them at this point.

Now, let’s send the 0x52 byte to the slave:

// Pull chip select low to assert it (activating the slave)
PORTB &= ~(1 << 2);

// Send 0x52 to the slave
SPDR = 0x52;

At this point, we have started to send 0x52 to the slave. But the instruction will complete before the peripheral has finished sending the data to the slave. So now, before we do anything else (such as trying to send another byte or reading what the slave sent to us), we need to wait for the transmission to complete.

If we had enabled interrupts and created an interrupt routine, we could go on to doing other stuff and an interrupt would occur as soon as the transmission finished. But since this example is not interrupt-driven, we will now poll the status register until we know that the transmission has completed:

while ((SPSR & (1 << SPIF)) == 0);

This code waits until the SPIF bit of the status register becomes 1. This means that the transmission has completed, so we can now read the 8 bits that the chip sent to us:

uint8_t result = SPDR;

Reading the status register while the SPIF bit is set and then accessing the SPDR register clears the SPIF bit (the datasheet for the AVR says so), and that is exactly what the polling loop followed by the read just did. So next time we begin a transfer, we can do the same thing: wait until the SPIF bit goes high again, and then read the SPDR register.

The last thing we should do, now that we’re done, is pull chip select high to complete the transfer. Some slaves might prefer for chip select to stay low until several bytes have been sent and received, but we will assume that this chip only wants chip select to go low for a single byte and then return high.

PORTB |= (1 << 2);
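For a slave that wants chip select held low across several bytes, or for the 16-bit transmissions mentioned earlier, the single-byte steps can be wrapped in a helper and called twice. This is a sketch under assumptions: the function-pointer type is hypothetical, sending the high byte first is a guess that depends on the slave, and the loopback function only exists so the sketch can run on a PC.

```c
#include <stdint.h>

/* Hypothetical signature for "send one byte, return the byte received".
 * On the AVR this would write SPDR, poll SPIF, and read SPDR back. */
typedef uint8_t (*spi_transfer8_fn)(uint8_t out);

/* Send a 16-bit value as two 8-bit transfers and return the 16 bits received.
 * The caller pulls chip select low before this call and high after it.
 * High byte first is an assumption; check the slave's datasheet. */
uint16_t spi_transfer16(spi_transfer8_fn transfer8, uint16_t out)
{
    uint8_t in_hi = transfer8((uint8_t)(out >> 8));
    uint8_t in_lo = transfer8((uint8_t)(out & 0xFFu));
    return (uint16_t)(((uint16_t)in_hi << 8) | in_lo);
}

/* PC stand-in for a real transfer: echoes back whatever was sent. */
uint8_t loopback_transfer8(uint8_t out)
{
    return out;
}
```

Passing the transfer routine in as a parameter keeps the byte-splitting logic separate from the register access, so the same helper works with a polled or a mocked transfer.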

There you go. You now know how to use the SPI peripheral in an AVR microcontroller. You’ll find that most of the 8-bit AVRs have an SPI controller very similar to this one. You’ll also find that this is basically how the SPI peripheral works in all microcontrollers. The toughest part is getting all of the setup values correct, particularly the CPHA and CPOL values. Once you have that in place, the rest of it is really, really simple.
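Since CPOL and CPHA are the part most likely to go wrong, here is a tiny sketch that turns an SPI mode number into the two SPCR bits. It relies on the convention described earlier (mode = CPOL and CPHA read as a 2-bit value) and on CPOL and CPHA sitting at bits 3 and 2 of SPCR. The function name is made up.

```c
#include <stdint.h>

/* Convert an SPI mode number (0 to 3) into SPCR's CPOL (bit 3) and
 * CPHA (bit 2) settings. Mode bit 1 is CPOL and mode bit 0 is CPHA,
 * so the two mode bits just need shifting up by two positions. */
uint8_t spi_mode_bits(unsigned mode)
{
    return (uint8_t)((mode & 3u) << 2);
}
```

Mode 2, the one used in this article, comes out as 0x08: CPOL on and CPHA off.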

I didn’t cover handling SPI with interrupts, but it’s not much more difficult than this — in the SPI interrupt handler, you can read back the data and either begin another transfer or pull chip select high to finish the transfer. If you have a big list of bytes that need to be sent as quickly as possible, interrupt-driven SPI would be ideal to take care of that.
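To give a rough idea of the bookkeeping an interrupt-driven version needs, here is a sketch of the logic with the hardware access left out so it can run on a PC. Everything here is hypothetical naming: the real handler would read SPDR, call something like this, and then either write the next byte to SPDR or pull chip select high.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

struct spi_job {
    const uint8_t *tx;  /* bytes to send */
    uint8_t *rx;        /* where received bytes are stored */
    size_t len;         /* number of bytes in the job */
    size_t done;        /* transfers completed so far */
};

/* Call from the transfer-complete interrupt with the byte just received.
 * Returns true and stores the next byte to send in *next_out, or returns
 * false when the job is finished and chip select should go high.
 * The main code starts the job by sending tx[0] itself. */
bool spi_job_on_complete(struct spi_job *job, uint8_t received, uint8_t *next_out)
{
    if (job->done >= job->len) {
        return false;  /* nothing in flight; ignore stray interrupts */
    }
    job->rx[job->done++] = received;
    if (job->done < job->len) {
        *next_out = job->tx[job->done];
        return true;
    }
    return false;
}
```

The main program only has to kick off the first byte; after that, each completed transfer triggers the next one without any polling.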

That’s all I have for SPI. Tune in next time for a discussion of some other yet-to-be-determined microcontroller peripheral!

I recently had the opportunity to get a Freescale KwikStik Kinetis K40 Cortex-M4 development board (thanks, Newark/Farnell!). I’ve been using the Cortex-M3, so I had been very excited to test out the Cortex-M4. Plus, I have also been anxious to evaluate Freescale’s ARM offerings, so this was the perfect board for that. Sadly, the Cortex-M4 included in the K40X256VLQ100 chip used on the KwikStik does not have the optional floating-point unit, so I wasn’t able to do any performance tests as far as that is concerned.

Anyway, I thought I’d share my experience with the board, so here goes nothing!

It came packaged in a nifty box that opens up to reveal two sides — one with some documentation and a getting started DVD, the other with the Kinetis in its awesome cover and a USB A-to-micro-B cable:

  

Here is the KwikStik, powered up in all its orange glory:

And here it is outside of its case:

 

The DVD contains various PDF datasheets for the Kinetis microcontrollers, evaluation and code-limited versions of several compilers, and Freescale’s MQX RTOS.

Freescale put a ton of interesting things on the KwikStik to play around with:

  • LCD display
  • 3.5 mm audio output jack
  • Microphone
  • Buzzer
  • Onboard USB J-Link programmer for flashing the chip (and to use the board as a programmer for other boards)
  • USB OTG/host/device port for applications
  • Infrared transmitter/receiver
  • microSD card slot (doesn’t totally work in the revision I received — see below)
  • Six capacitive touch buttons
  • Rechargeable MgLi (manganese lithium) coin cell battery

So without further ado, let’s get started with the fun stuff!

When you power the board up with the default firmware by plugging into either of the two USB ports, a menu comes up with three options, navigable with the capacitive touch buttons. The options are: sound recorder, remote control, and USB joystick.

The sound recorder will record two seconds of audio with the built-in microphone and play the recorded sound back through the 3.5 mm jack. The remote control feature is meant for controlling a Sony TV, and the USB joystick will let you use some of the capacitive touch buttons as a simple HID joystick on a computer.

Unfortunately, I couldn’t get the HID joystick demo to work — it was recognized on the computer as a gamepad, but the buttons wouldn’t register any presses in the Windows 7 game controller settings dialog. I also couldn’t test the Sony TV controller since I have a Vizio TV. The sound recorder works fine though!

The default firmware doesn’t really matter much, though — the fun part is making your own programs to do stuff! To start out, I downloaded CodeWarrior Development Studio for Microcontrollers (Special Edition with 128K code size limit). I had to download it since it didn’t come on the DVD. Installing it was a bit of a pain because I later found out that putting it in the default C:\Program Files (x86) directory will cause problems later on when trying to update it, so I had to uninstall it and put it in a standard unprotected location where administrator access isn’t needed to modify files. Despite that annoyance, I like it because it’s based on Eclipse and I’m used to Eclipse. I also tried IAR, but I didn’t like the user interface (and couldn’t get it to work with the demo firmware anyway). I also downloaded the Segger J-Link software — Segger’s site seemed to imply that my DVD should have come with it, but I couldn’t find it as an option to install in the Flash interface. Luckily, Segger still lets me download it as long as I agree not to use it with illegal cloned boards. So I got that all ironed out!

Despite not caring too much about the default firmware, the first thing I did was download the factory image off the device, just in case. After installing the Segger software, I used JLink.exe to read the flash contents after using JMem.exe to determine where the flashed image ended (where a bunch of endless 0xFFs began). It turns out the firmware binary is 0xD004 bytes long, so I used these commands in JLink.exe:

h
savebin C:\Path\To\DefaultFirmware.bin, 0x0, 0xD004

The “h” command halts the CPU, and the “savebin” command will save 0xD004 bytes of data starting at address 0 to DefaultFirmware.bin, so I can always restore the board back to the exact way I got it. Yeah, I’m paranoid like that. It’s always the first thing I do when I get a microcontroller evaluation board. The way I see it, I’m going to need to know how to use the flashing/debugging tools anyway, so it’s not a waste of time to do that first!
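Finding where the image ends can also be done in code instead of by eye. Here is a minimal sketch that assumes you already have a dump of the whole flash in memory and that erased flash reads back as 0xFF. The function name is made up, and it has nothing to do with the Segger tools.

```c
#include <stdint.h>
#include <stddef.h>

/* Return the number of bytes up to and including the last byte that is
 * not 0xFF. Returns 0 if the whole dump is blank. */
size_t image_length(const uint8_t *dump, size_t size)
{
    while (size > 0 && dump[size - 1] == 0xFF) {
        size--;
    }
    return size;
}
```

One caveat: if the firmware itself happens to end in 0xFF bytes, this will trim them, so it is safer to round the result up a little before saving.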

Next, I decided to try to compile Freescale’s example firmware to see if maybe they had worked out some of the kinks in the HID joystick demo. This required updating CodeWarrior to the latest version and installing the latest version of Freescale’s MQX RTOS, which the demo uses. Finally, after trying to compile the example firmware, I ran into some errors because there were a couple of hardcoded “C:\Program Files” paths which I had to change. After all that mumbo jumbo, I was finally able to compile their example firmware which is available for download on the KwikStik page on their website.

Programming the device from inside CodeWarrior worked fine, and once that was done, I ran the demo — they actually had replaced the HID demo with a USB mouse demo, which does work (it lets me left click and right click). They had also fixed a bug that caused the remote control app’s exit button to not work, and they added a Tetris clone game demo.

OK, so at this point I finally had an environment ready for doing some real development! I didn’t really feel like learning how to use Freescale’s RTOS yet, so I just made a simple bare metal demo app to turn on all the LCD’s segments in succession, then turn them off in succession, endlessly.

CodeWarrior comes with some pretty interesting features. The microcontroller appears as a picture in CodeWarrior, and the picture contains a bunch of blocks for all the built-in peripherals, including the CPU itself. You can click on them to set up options (such as clocking for the CPU) and it will generate initialization code for you. It made it pretty easy to make a quick app. I borrowed some of the code from the KwikStik demo application for the LCD to understand the options Freescale used for setting it up, and it pretty much worked right out of the box. The one catch was that the LCD was visibly flickering in my app. The fix was to change the IRCLK source to the fast internal reference clock (2 MHz) instead of the default slow internal reference clock (32.768 kHz). It looks like the SLCD’s initialization was using the IRCLK, and the default setting was not fast enough. With that out of the way, my program worked like a charm!

After all the initialization is complete, here’s the simple main loop code:

int x;
int y;
volatile int c;

for (;;)
{
    for (x = 1; x < 40; x++)
    {
        for (y = 0; y < 8; y++)
        {
            // Turn on the segments one at a time...
            LCD_WF8B(x) |= (1 << y);

            // Wait a bit?
            c = 0x20000;
            while (c--);
        }
    }

    for (x = 1; x < 40; x++)
    {
        for (y = 0; y < 8; y++)
        {
            // Turn off the segments one at a time...
            LCD_WF8B(x) &= ~(1 << y);

            // Wait a bit?
            c = 0x20000;
            while (c--);
        }
    }
}

I like the SLCD peripheral! It makes it easy to turn segments on or off by changing a single bit, and then you’re done. The SLCD peripheral uses up to 8 pins as backplane pins, and the other 40 pins can be used to control 40 segments per backplane. It is very quickly switching between the eight backplanes and turning the segments on and off as necessary. This all happens so fast that you have no idea that the LCD segments are being turned on and off (well, unless you accidentally set the clock rate too low like I did!). With 8 backplane pins and 40 segment pins, you can control up to 320 LCD segments with this peripheral. It would be a big pain in the rear end to implement the LCD refreshing I just described manually, and the peripheral makes it so easy — just set the bit for the segment you want to turn on, and you’re done!
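The "one bit per segment" idea can be modeled on a PC with a plain array. This sketch assumes the layout my loop above implies (one 8-bit entry per segment pin, one bit per backplane). The array stands in for the real waveform registers, and the names are made up.

```c
#include <stdint.h>
#include <stdbool.h>

enum { SLCD_PINS = 40, SLCD_BACKPLANES = 8 };

/* wf[pin] models one 8-bit waveform register; bit n selects backplane n.
 * Out-of-range requests are ignored. */
void segment_write(uint8_t wf[SLCD_PINS], unsigned pin, unsigned backplane, bool on)
{
    if (pin >= SLCD_PINS || backplane >= SLCD_BACKPLANES) {
        return;
    }
    if (on) {
        wf[pin] |= (uint8_t)(1u << backplane);
    } else {
        wf[pin] &= (uint8_t)~(1u << backplane);
    }
}
```

With 40 entries of 8 bits each, the model has room for exactly 320 segments, the same count as 8 backplanes times 40 segment pins.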

The next thing I should do is get rid of the ugly busy waiting that I have for delays right now, and switch over to using a timer, which would be pretty easy to do.
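Here is a sketch of the usual pattern, leaving out the timer setup itself since that is chip-specific. It assumes a timer interrupt increments a 32-bit tick counter somewhere, and the function name is made up.

```c
#include <stdint.h>
#include <stdbool.h>

/* True once at least `interval` ticks have passed since `start`.
 * Unsigned subtraction keeps this correct even after the tick counter
 * wraps around from 0xFFFFFFFF back to 0. */
bool ticks_elapsed(uint32_t now, uint32_t start, uint32_t interval)
{
    return (uint32_t)(now - start) >= interval;
}
```

The main loop would remember the tick count when it last changed a segment and only move on to the next segment once this returns true, leaving the CPU free to do other things in between.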

So yeah, once you get down to making your own stuff, it’s a cool little board with tons of possibilities for interesting applications!

Now, remember how I said the SD card slot doesn’t totally work? Well, it turns out that in board revisions earlier than revision 5 (I received revision 4), Freescale accidentally connected the SD card socket’s data pins to the wrong pins on the microcontroller! It essentially makes the microSD card slot useless because you can’t use the Kinetis’s built-in SDHC controller peripheral for accessing the SD card. You could bit-bang it yourself, but that will have a (very negative) performance impact. Maybe I can hack the board with some cut traces and wires to connect the SD card socket’s pins to the correct microcontroller pins — who knows? Maybe I’ll do THAT as a fun project soon! If I do, I’ll be sure to post about it here!

So, in conclusion, I’m happy with the board, although it seems like Freescale made a really boneheaded mistake (the SD card wiring problem is inexcusable, especially given that it was still present in revision 4 of the board!). Regardless of that, though, I’d still definitely recommend it as a platform for learning about microcontrollers, simply because of all the awesome stuff it has built-in aside from the microSD slot. And seriously, a 100 MHz processor in a microcontroller? I know the Cortex-M3s were up there in clock speed too, but still–what’s the world coming to? With all of that processing power combined with the wide range of peripherals, it provides plenty of awesome opportunities to learn about microcontroller programming. The included orange protective jacket is a nice touch, too!

You can purchase the Kinetis KwikStik from Newark.

Until next time, see ya later!

I haven’t written about my Mac ROM hacking lately, so this post will double as an update to my ROM hacking endeavors and a review of Seeed Studio’s awesome Fusion PCB service.

After going through all the hoopla to desolder the DIP chips on my Mac IIci and socket them, I finally got frustrated. The main annoyance is that the DIP ROMs are all the way underneath the hard drive and floppy drive carrier. It was forcing me to basically rip everything out of the IIci in order to access the DIP sockets. I also had been in contact with the folks on the 68k Mac Liberation Army forums, and they had a lot of useful information, including the tidbit that many of the Mac II series machines (and the SE/30) have a SIMM socket that you can put a ROM SIMM into. My IIci also has this socket, and it’s easily accessible without removing anything from the case. So, I decided to make my own! Thanks to a lot of advice from fellow forum members, I was able to lay out a SIMM printed circuit board to have manufactured.

I ran into a problem: the SIMM needs to be about 0.047″ thick in order to physically fit in the socket. Most of the inexpensive circuit board prototyping manufacturers make boards that are 0.063″ thick. Luckily, bigmessowires from the 68k Mac Liberation Army forums was able to recommend a PCB manufacturer that could make the boards in the thickness I needed: Seeed Studio Fusion PCB. He had used them in the past with good results, so I went for it.

Seeed Studio has amazing pricing. My board was about 3.85″ by 1.1″. That fits within a 10 cm by 5 cm rectangle. As of this writing, they can manufacture ten 2-layer boards of that size for a total of $24.90 plus shipping. That includes solder mask and silkscreen on both sides. It’s an incredible price! They also offer many thickness options, including 1.2 mm, which works out to 0.047″. I asked for red solder mask instead of the standard green, which added an additional $10 to the price. They have extremely good trace width/spacing requirements: 6 mil (0.006″) trace width and 6 mil spacing. I went ahead and made my minimum trace width and spacing 8 mils, just to be safe.
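The unit conversions are easy to double-check with a few lines of code. This is just arithmetic using 25.4 mm per inch; the function names are made up.

```c
#include <stdbool.h>

/* 1 inch is exactly 25.4 mm. */
double mm_to_inch(double mm)
{
    return mm / 25.4;
}

double inch_to_mm(double inch)
{
    return inch * 25.4;
}

/* Does a board measured in inches fit inside a size limit given in mm? */
bool board_fits(double w_in, double h_in, double max_w_mm, double max_h_mm)
{
    return inch_to_mm(w_in) <= max_w_mm && inch_to_mm(h_in) <= max_h_mm;
}
```

A 1.2 mm board works out to roughly 0.047″, and a 3.85″ by 1.1″ board is about 97.8 mm by 27.9 mm, which fits within the 10 cm by 5 cm limit.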

After you place your order online, you e-mail them a zip file containing the Gerber files for your PCB. If there are any problems, they will let you know — otherwise, you just have to wait for the boards to be manufactured and shipped.

Shipping was only $3.52 for registered air mail to the United States from Hong Kong. They had other options, too, but this was the least expensive one. They take PayPal as a payment option, and I can’t remember the other options they offered. I placed my order on the night of Monday, September 5, they shipped my board on Tuesday, September 13, and the package arrived on Friday, September 23.

I’m sure you’re wondering: how did the boards turn out? Well, they turned out just fine! As you can see, they were able to manufacture a board with an irregular shape:

You may see some numbers on the silkscreen. Seeed Studio’s directions tell you to add your order number to the silkscreen somewhere, so I did that on the bottom. They also added a few other numbers onto the top silkscreen. Not a big deal at all, especially for the price!

Electrically, the SIMM tested fine. All the traces were perfect. Seeed Studio will electrically test 5 of your boards for free, and you can pay to have the other 5 tested, too.

Oh, did I mention that they actually sent me 12 boards instead of 10? How cool is that?

I would highly recommend them if you are looking for budget PCB fabrication. They only do 1 or 2 layer boards, so you can’t do anything overly fancy with 4 or more layers, but 2 layers is plenty for all kinds of fun stuff you can make, such as this SIMM.

By the way, my SIMM ended up working perfectly, and it expanded my IIci’s ROM capacity to 2 MB (the original ROM is only 512 KB). I naturally tested the rest of the capacity by filling the remaining 1.5 MB with the Super Mario Bros music and turning it into the longest Mac startup chime ever:

If you’re curious what the assembled board looks like, I show it off briefly in the video.

So anyway, all in all, I had a very positive experience with Seeed Studio’s Fusion PCB service! I was shocked when I saw the pricing, and I am very happy with the boards I got back. Thanks, Seeed Studio!

 

Introduction

OpenOCD is an extremely useful (and free!) piece of software for microcontroller programming. In particular, I use it to program and debug various development boards I have lying around. I have a Luminary Micro/TI Stellaris LM3S2965 evaluation kit that has a built-in USB port which can be used for JTAG. The board also has a JTAG connector on it, so you can use it as a passthrough for JTAG on another board. The JTAG is powered by an FTDI FT2232D USB to serial chip. Just calling it a USB to serial chip is not giving it quite enough credit, though: it can be used for all kinds of crazy things, including JTAG. I would venture a guess that most USB to JTAG adapters you can buy are really just a simple board with this chip on it connected to a JTAG connector on one end, and USB on the other.

Anyway, OpenOCD is a bit interesting. It can interface to FT2232-based JTAG devices with two options for communication. The first option is libftdi, an open-source library that uses libusb to talk to the FTDI chip. The second option is ftd2xx, which is a proprietary library provided by FTDI itself (which also appears to be based on libusb?).

So naturally the question is: which one should you use?

Well, I think it depends on your specific needs. I first got OpenOCD working a couple of years ago in Ubuntu Linux. At the time, it didn’t really matter to me which library to use, so I tried them both. My experience was that FTDI’s own driver was much faster than libftdi. The situation may have changed since then, but I haven’t had a reason to change my setup.

When I got OpenOCD working on Windows 7 64-bit, it got even more interesting. At the time, libusb was not a signed driver, so it was a huge pain in the butt to get it to work with Windows 7. Apparently you could get it to work by starting up Windows with a special option in the same startup menu you’d use to boot into safe mode. But you’d have to do that EVERY time you booted, or the driver wouldn’t work. That’s my understanding of the situation, anyway. The other option was just to use FTDI’s driver and forget about it all. So I did.

Things have changed since then, though. libusb is now available as a signed driver that won’t make you jump through hoops on Windows. I haven’t tested libftdi lately to see how it compares performance-wise to ftd2xx. Anyway, I’ve stuck with ftd2xx just because it’s what I started with, so this post will be about the ftd2xx library. It’s very possible that my instructions will still work with libftdi though!

Anyway, here’s the other thing about the ftd2xx library. It’s a proprietary library incompatible with the license used by OpenOCD (GPL). So you won’t be able to find a (legally good) distribution of OpenOCD that has the ftd2xx library capability built-in. It’s perfectly OK to distribute OpenOCD that’s linked against libftdi since it’s compatible with the “viral” GPL. But since back in the day I needed OpenOCD with ftd2xx, I had to compile it myself.

I recently had to compile OpenOCD for 64-bit Windows again. That’s where this blog post comes in.

Compiling OpenOCD for Windows

There are a few options you can use to compile stuff designed for GNU tools in Windows. One example would be MSYS, and another would be Cygwin. These are both EXCELLENT environments, but I don’t like having to install a bunch of their stuff on my computer just for the sole purpose of having all the crazy tools necessary to be able to compile OpenOCD. I’d rather keep my Windows partition nice and clean to the point where I just have the OpenOCD executable and any supporting libraries, and that’s it. But how do you get past needing all the stuff necessary to run configure scripts, process Makefiles, and yadda yadda yadda?

You cross-compile it for Windows using Linux and MinGW! That’s how!

Here are the basic instructions I found on Dangerous Prototypes for cross-compiling OpenOCD for Windows from Linux. I tried to follow them step-by-step, but I ran into a few snags, so I decided to make my own post that will go step-by-step through compiling OpenOCD for 64-bit Windows (Vista and 7, and probably 64-bit XP too?). I see no reason why this shouldn’t work on a 32-bit platform as well–just substitute any 64-bit compilers/tools I specify with the corresponding 32-bit versions instead!

I had trouble getting 64-bit MinGW to work correctly in Ubuntu, so this post by Matpen on the Ubuntu forums also helped me immensely. I’m guessing if you’re doing a 32-bit compile, you don’t need to worry about the problems I had with the 64-bit MinGW included with Ubuntu. The version included with Ubuntu 10.04 has a really deep problem in that it doesn’t include libgcc_s.a, and the version included with Ubuntu 11.04 has problems such as not providing a getopt.h include file.

Finally, I had more trouble getting OpenOCD to link against the ftd2xx library after I solved the first problem, and this post by el_nihilo on the SparkFun forums also helped me out.

I’m hoping that by combining the instructions from these three sources, I can create a single source that anyone can go to in order to get OpenOCD to compile on the FIRST TRY. Here goes nothing!

Prerequisites (for Ubuntu 12.04)

Since I first wrote this article, Ubuntu 12.04 came out, and this process is MUCH simpler to get working with it. The supplied toolchains work fine now. All you need to do is:

sudo apt-get install mingw-w64

No need to install weird packages or regenerate FTDI’s .lib file. Great!

Prerequisites (for Ubuntu < 12.04)

You’re going to need an install of Ubuntu (32-bit or 64-bit — your choice). I have successfully tested this procedure on the 64-bit version of Ubuntu 10.04,  the 32-bit version of Ubuntu 10.10, and the 64-bit version of Ubuntu 11.04. Use whatever you’d like. You can do it on a separate partition, in a VMware Player virtual machine, or whatever else it takes to get Ubuntu.

To start, we’re going to install some packages that Ubuntu will need:

sudo apt-get install libcloog-ppl0 libgmpxx4ldbl libmpfr1ldbl libppl-c2 libppl7

Now, rather than use Ubuntu’s provided 64-bit MinGW package, which does not work (as of this writing), we will instead use a version which does. So get the two packages that apply to you and install them:

If you’re on a 32-bit version of Ubuntu:

wget http://ppa.launchpad.net/mingw-packages/ppa/ubuntu/pool/main/w/w64-toolchain/i686-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_i386.deb
wget http://ppa.launchpad.net/mingw-packages/ppa/ubuntu/pool/main/w/w64-toolchain/x86-64-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_i386.deb

sudo dpkg -i i686-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_i386.deb
sudo dpkg -i x86-64-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_i386.deb

If you’re on a 64-bit version of Ubuntu:

wget http://ppa.launchpad.net/mingw-packages/ppa/ubuntu/pool/main/w/w64-toolchain/i686-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_amd64.deb
wget http://ppa.launchpad.net/mingw-packages/ppa/ubuntu/pool/main/w/w64-toolchain/x86-64-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_amd64.deb

sudo dpkg -i i686-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_amd64.deb
sudo dpkg -i x86-64-w64-mingw32-toolchain_1.0b+201011211643-0w2273g93970b22426p16~karmic1_amd64.deb

These commands will install special versions of MinGW that aren’t broken for producing 64-bit code. I’m not 100% sure if you actually need to do the i686 toolchain as well, but I’ve included them because the original steps from the Ubuntu forums also included them. The packages may say Karmic in the filenames, but I was also able to install them successfully in 10.04, 10.10, and 11.04.

Downloading OpenOCD and FTDI’s library

So now let’s grab OpenOCD and the FTDI library:

wget http://download.berlios.de/openocd/openocd-0.5.0.tar.bz2
wget http://www.ftdichip.com/Drivers/CDM/CDM20814_WHQL_Certified.zip

First, if you’re using the MinGW toolchain from before Ubuntu 12.04, you need to regenerate the FTDI library’s .lib file, because it and that MinGW version just don’t get along. If you have Ubuntu 12.04 and its built-in MinGW toolchain, skip down to “The actual compilation of OpenOCD.” If you have an Ubuntu version before 12.04 and don’t do this step, the final link of openocd.exe will fail with several errors similar to: undefined reference to `__imp__FT_Write’. Extract CDM20814_WHQL_Certified.zip and do the following commands:

cd CDM20814_WHQL_Certified/amd64
echo "LIBRARY ftd2xx64.dll" > ftd2xx64.def
echo "EXPORTS" >> ftd2xx64.def
strings ftd2xx64.dll | grep FT_ >> ftd2xx64.def
mv ftd2xx.lib ftd2xx.lib.old
x86_64-w64-mingw32-dlltool -d ftd2xx64.def -l ftd2xx.lib

If you go to the post by el_nihilo on the SparkFun forums which I mentioned earlier, you’ll see what’s going on. It has to do with the difference between how DLL function names are expected to be prefixed in Visual C++ and MinGW. It’s pretty goofy that there have to be all these stupid incompatibilities, but this clever little trick fixes it by regenerating the .lib file so it’s compatible with MinGW. This step was borrowed directly from that post.
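If you'd like to see what the .def file generation boils down to, here is a rough Python equivalent of the echo/strings/grep steps. It is only a sketch: the function names are made up, and the "printable run of at least 4 characters" rule is an approximation of what strings does by default. You would still run dlltool on the result exactly as shown above.

```python
import re


def ft_export_names(dll_bytes):
    """Return printable ASCII runs containing 'FT_', roughly like `strings | grep FT_`."""
    # Approximation: a "string" is a run of 4 or more printable ASCII characters.
    runs = re.findall(rb"[\x20-\x7e]{4,}", dll_bytes)
    return [run.decode("ascii") for run in runs if b"FT_" in run]


def make_def(dll_name, names):
    """Build the text of a .def file: a LIBRARY line, an EXPORTS line, then one name per line."""
    lines = ["LIBRARY " + dll_name, "EXPORTS"] + list(names)
    return "\n".join(lines) + "\n"


# Tiny fake "DLL" contents, just to show the shape of the output.
fake_dll = b"\x00\x01FT_Write\x00junk\x00FT_Read\x00ab\x00"
print(make_def("ftd2xx64.dll", ft_export_names(fake_dll)), end="")
```

With the real DLL you would read the file in binary mode, pass its contents to ft_export_names, and write the resulting text to ftd2xx64.def.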

Once all this preparation is done, it’s safe to try compiling OpenOCD!

The actual compilation of OpenOCD

Now, compiling OpenOCD is relatively straightforward!

Go into the OpenOCD directory, and do this command:

./configure --host=x86_64-w64-mingw32 --disable-werror --with-ftd2xx-win32-zipdir=/path/to/CDM20814_WHQL_Certified --with-ftd2xx-lib=static --enable-ft2232_ftd2xx --prefix=$PWD/_install

Let’s step through the options we gave to the configure script so we make sure we understand what it’s doing.

  • --host=x86_64-w64-mingw32

    specifies that we are compiling for a different host than the one we’re running on now. This is how we tell it to use the MinGW cross compiler.

  • --disable-werror

    makes it so a compiler warning is not treated as an error. We will get some compiler warnings which would cause the build to fail if we didn’t provide this option. The warnings seem to be harmless; it works just fine!

  • --with-ftd2xx-win32-zipdir=/path/to/CDM20814_WHQL_Certified

    is just the path to the folder containing the contents of the extracted zip file with the ftd2xx driver and libraries. It’s telling the configure script where to find the header file and library to compile/link against.

  • --with-ftd2xx-lib=static

    supposedly tells it to link OpenOCD against the static library instead of the dynamic library. Despite that fact, I still have to include ftd2xx64.dll with the final generated binary. Not sure why, but it works!

  • --enable-ft2232_ftd2xx

    tells the configure script that we’re going to want FT2232 support provided by the ftd2xx proprietary library.

  • --prefix=$PWD/_install

    just tells it to install in a directory called _install in the current directory instead of the default location of /usr/local. Since we’re just going to be transporting it over to the Windows computer when done, it makes no sense to install it in /usr/local.

So, do it!

Then, when it’s done and succeeds (hopefully), compile it:

make

and install it:

make install

Now, your OpenOCD directory should have a directory called _install inside of it, containing several directories including (almost) everything you need to get OpenOCD running on Windows! The only thing it’s missing is ftd2xx64.dll. Grab it from the CDM20814_WHQL_Certified/amd64 directory and stick it in the bin directory alongside openocd.exe. At this point, you should be able to transfer this directory over to your Windows partition/machine/whatever and run OpenOCD just like any other random Windows console app.

Hope this helps someone else who, like me, was struggling to get it to work!

In my last post, I wrote about how I had figured out how the Macintosh IIci’s synthesized startup sound works. I talked about how I replaced the ROM chips on the motherboard with sockets and disassembled the ROM code. I shared a video showing how I customized it to play the first few notes from the Super Mario Bros song. At the very end of it, I talked about how it would be awesome to figure out how to play a sampled sound at boot time, which would allow me to play any sound I wanted, limited by space available in the ROM. I didn’t feel very optimistic about getting that done, but after a ton of reading, experimentation, and frustration, I have figured it out, and my IIci now plays a sampled startup sound when it boots up. I’d like to share how I did it, as well as provide instructions on how to patch your own IIci’s startup sound with the sound of your choosing.

Before I start, I’d like to thank the people in the 68k Mac Liberation Army forums for the help and encouragement, and also the people who coded the MESS emulator for leaving a very handy register reference for the Apple Sound Chip in their source code. Without any of the aforementioned people, I would never have been able to get this far in my hacking!

Here are some videos, followed by how I did it and how you can do it. I decided to inject one of the sampled startup sounds from the LC/Performa series into my IIci:

And here is the sampled startup sound from the 5200/5300/6200/6300 series Macs:

How I did it

To begin, I disassembled an LC III ROM dump. The LC III is one of the oldest Macs that has a sampled startup chime. Its sound chip is not quite the same as the IIci’s, but the way it plays its sampled sounds at boot time gave me a few clues. To write sampled sounds to the Apple Sound Chip, you basically just keep writing samples, one at a time, as long as there is room in the chip. You determine whether there’s room in the chip or not by reading a status register.

So based on how the LC III did it, I wrote a program to test playing a sound by talking directly to the sound chip on my IIci. It failed miserably for two reasons:

  1. The bits of the FIFO status register in the sound chip act slightly differently on the LC III compared to the IIci
  2. When I’m booted into the Mac operating system, there are interrupts looking at the status of the sound chip. An interrupt can grab the FIFO status before I get a chance to see it, and then I’m stuck waiting in an infinite loop for a bit to change, even though it already changed long ago.

No biggie though! First of all, I decided to forget about the FIFO status register completely, and instead I made my program write samples blindly to the chip, adding a pause occasionally to give the chip some time to play the samples and clear space in its FIFO. I messed around with my delays, and I eventually was able to get it to mostly work. By mostly, I mean sometimes it would make some crackly sounds while playing the sound I wanted it to play. I figured this was because interrupts sometimes made my delays longer than they should have been, causing the FIFO to empty out for a short time. My main goal was accomplished though: I knew how to tell the chip to play sounds.

The delay code was lame, though. It makes more sense to listen to what the chip is telling me, rather than guess the status of the chip based on time delays. To solve problem #2, I disabled interrupts during my program. This helped immensely because it prevented the operating system from talking to the sound chip behind my back. Next, I started playing with the status bits to see when they come on and turn off. It was here that I discovered problem #1: The IIci’s sound chip doesn’t behave the same way the LC III’s code implies its sound chip behaves.

The LC III’s code always waits for one of the FIFO status bits to be “1” before it writes a sample to the chip–even the very first sample. The MESS source code says this bit is a “FIFO half empty” status bit. On the LC III, it looks like this bit is 1 any time the FIFO is anywhere between completely empty and half empty. If it’s more than half full, the bit is zero. On the IIci, though, this bit stays at zero, and only becomes 1 once you have filled the FIFO more than half full and it has dropped back down to being only half full by playing enough samples. Plus, once you read the bit, it goes back to 0 until the FIFO has filled more than half full again (and dropped back down to half full after that, which sets the bit at 1 again). This explained why my first attempt failed — an interrupt probably read the status bit’s 1 value (bringing it back to 0 in the process) and I was stuck waiting to see it and never did.

Once it’s finished writing all the samples to the chip, the LC III’s code waits for another status bit to be 1 before continuing on. I’m guessing that this other bit acts as a “FIFO is completely empty” bit, so it’s allowing the code to wait until the FIFO has totally drained out. I don’t know for sure, but that would make the most sense based on what the code is doing. On the IIci, though, according to my tests, this bit is zero until the FIFO is completely full. Then it becomes 1 until you read it, and it resets back to zero immediately.

Based on this information, I decided on an algorithm to use with the IIci to play sampled sounds driven totally by the two status bits:

  • Check the “FIFO is full” bit.
  • If the FIFO is not full (the bit was 0), write the next sample to the chip and start over again at the top.
  • If the “FIFO full” bit was 1, then wait until the “FIFO is half empty” bit comes on, then write the next sample to the chip and start over again at the top.

So the algorithm starts with the FIFO completely empty, fills it completely up, then waits for it to become half empty again, fills it completely up, waits until it’s half empty, fills it up, and so on, until it’s done.
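To make the fill/wait cycle easier to follow, here is a toy simulation in Python. This is not the 68k routine, and FakeIIciFifo is a made-up model: it only mimics the two status bits the way I described them above (each one latches and is cleared by reading it). The FIFO capacity is an assumption chosen for the simulation, and simulated time only passes while the loop is waiting.

```python
class FakeIIciFifo:
    """Toy model of the two IIci FIFO status bits as described in this post."""

    def __init__(self, capacity=1024):   # capacity is an assumption for the simulation
        self.capacity = capacity
        self.level = 0                   # samples currently queued
        self.peak = 0                    # highest level ever reached
        self.dropped = 0                 # writes attempted while already full
        self._full = False               # latched "FIFO full" bit
        self._half_empty = False         # latched "FIFO half empty" bit
        self._armed = False              # True once the level has gone above half

    def write(self, sample):
        if self.level >= self.capacity:
            self.dropped += 1
            return
        self.level += 1
        self.peak = max(self.peak, self.level)
        if self.level > self.capacity // 2:
            self._armed = True
        if self.level == self.capacity:
            self._full = True

    def tick(self):
        """Simulate the chip playing one queued sample."""
        if self.level > 0:
            self.level -= 1
        if self._armed and self.level <= self.capacity // 2:
            self._half_empty = True
            self._armed = False

    def read_full(self):
        value, self._full = self._full, False              # reading clears the bit
        return value

    def read_half_empty(self):
        value, self._half_empty = self._half_empty, False  # reading clears the bit
        return value


def play(fifo, samples):
    """Fill the FIFO, wait for it to drain to half, and repeat until done."""
    for sample in samples:
        if fifo.read_full():
            # The FIFO was full: wait for the "half empty" bit before writing more.
            while not fifo.read_half_empty():
                fifo.tick()              # simulated time passing
        fifo.write(sample)


fifo = FakeIIciFifo(capacity=8)
play(fifo, bytes(range(40)))
print(fifo.dropped, fifo.peak)
```

In this model no write ever lands on a full FIFO, because every "full" reading is followed by a wait for "half empty" before the next write.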

I couldn’t find a way to determine that the FIFO was completely empty, though–so I may be cutting off the end of the sound. I’m not sure about that yet. I don’t notice it, but it’s possible that the sound I’m playing has some silence at the end of it anyway.

So after getting it working in a simple Mac program, I injected the code into the free space in the IIci’s ROM and patched the ROM to jump to my routine instead of the normal startup chime. After a monumental screwup where I accidentally commented out a single line that caused the whole thing to fail and had me puzzled for hours, I got it to work on my second try!

A couple of days later I tried another sound. The 5200/5300/6200/6300 startup chime is actually sampled at 11.127 kHz, which is half the Apple Sound Chip’s standard sample rate. So to play it, I changed my code slightly to write each sample into the sound chip twice, doubling the effective sample rate to 22.254 kHz. This also lets me fit a sound twice as long into the same amount of ROM space!
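The sample-doubling idea is simple enough to show in a few lines of Python. This is just an illustration of the concept, not the actual 68k code; repeat_each is a hypothetical helper, and the sample values are arbitrary.

```python
STORED_RATE_HZ = 11127   # rate the chime was sampled at
CHIP_RATE_HZ = 22254     # rate the sound chip plays at


def repeat_each(samples, times=2):
    """Yield every stored sample `times` times in a row, at playback time."""
    for sample in samples:
        for _ in range(times):
            yield sample


stored = bytes([0x80, 0x90, 0xA0])    # what sits in ROM
sent = bytes(repeat_each(stored))     # what actually gets written to the chip
print(sent.hex())
print(len(sent) / CHIP_RATE_HZ == len(stored) / STORED_RATE_HZ)
```

Because the repetition happens while playing, the ROM only has to hold the 11127 Hz data, so the same number of bytes covers twice as much time.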

So that’s my background info on how I did it. Now…I’m sure you’re chomping at the bit to do it yourself, right?

How you can do it

You need:

  • A working Mac IIci
  • Some soldering and desoldering skills
  • Four 32-pin 0.1″ pitch DIP sockets
  • Four pin-compatible EEPROMs (I used the Greenliant GLS29EE010)
  • An image of your Mac IIci’s ROM — read it from the chips after removing them or read it from the Mac before you tear it apart. (Do NOT ask me for ROM images – I’m not interested in infringing on Apple’s copyright)
  • An EEPROM burner compatible with the EEPROMs you choose and a computer that it can talk with — most likely it will be a Windows-based computer with a parallel port.
  • The ROM patches I will provide below
  • A hex editing utility to stick the patches where they belong (I used HxD)
  • A tool to recalculate the ROM’s checksum after your modification. Ben Boldt’s Mac ROM Checksum verifier will help you figure out what the checksum should be (nice work Ben!) — note: if you run this program on Windows, change the code so that it opens the files in binary mode. Otherwise it won’t work because it will do weird stuff with bytes that happen to be carriage returns and line feeds.
  • A tool to split the ROM file into four interleaved segments for burning. I made one for .NET that also does the checksumming but I don’t have time to upload it yet…if you don’t want to wait, make a suitable tool yourself! 🙂 See my previous post for info on how the ROM chips are interleaved.

All right! As far as hardware goes, desolder the old ROMs and remember where each one goes. Solder sockets in their place and put the original ROMs into the sockets to make sure that it still boots OK. Each EEPROM will have at least one extra pin that was not connected on the IIci’s original ROM. It will need to be connected to something and not left floating. See the datasheet of your chip to determine what it should be connected to for “read mode”. In the case of the GLS29EE010, it is a pin right next to VCC that is used for programming (the pin is called WE#), and in read mode it is supposed to be pulled high. How convenient! On the bottom of the motherboard, you can blob solder between these two pins to connect them and it should work well–at least for the GLS29EE010!

Now, read the ROM files onto your computer (or use the image you read from the Mac before tearing it apart and split it into 4 interleaved files). Burn the four ROM segments onto your EEPROMs, stick them in place of the original ROMs, and make sure the IIci still works. This will verify that your EEPROM burner and the hardware modifications are working correctly.

From this point on, it’s all software. First of all, here is the source code I used to generate this binary blob of sound-playing goodness (if you want to compile my code to make changes, you will need a 68k version of GNU binutils):

Sampled startup chime source code for Mac IIci

Here is the final assembled 140-byte binary generated from the source code:

Sampled startup chime binary ready for injection into Mac IIci ROM

You will need to append your own sound to the end of this binary blob. It should be in raw 8-bit unsigned format with a 22254 Hz sample rate. Audacity should be able to produce a file in that format for you [File–>Export–>Other uncompressed files–>Options–>RAW (header-less)–>Unsigned 8-bit PCM], and it should also be able to take a sound sampled at a different rate and convert it to 22254 Hz [Tracks–>Resample]. After the end of my binary blob, append the length of the sound (in number of samples) as a four-byte big-endian integer. Then append the raw 8-bit unsigned file you generated containing your sound. I decided to skip Apple’s sound resource format or AIFF or WAV or anything like that because it would do nothing except complicate the sound playing code.
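If you'd rather script the appending than do it by hand in a hex editor, a minimal Python sketch could look like this. The function name is made up, and the 140 placeholder bytes below merely stand in for the real assembled binary.

```python
import struct


def build_chime_blob(player_code: bytes, sound: bytes) -> bytes:
    """Player code, then the sample count as a 4-byte big-endian integer, then the raw samples."""
    # One byte per sample (8-bit unsigned), so the byte count is the sample count.
    return player_code + struct.pack(">I", len(sound)) + sound


placeholder_code = bytes(140)           # stand-in for the assembled 140-byte binary
sound = bytes([0x80, 0x7F, 0x81])       # stand-in for raw 8-bit unsigned samples
blob = build_chime_blob(placeholder_code, sound)
print(len(blob), blob[140:144].hex())
```

In real use you would read both inputs from files opened in binary mode and write the combined blob back out the same way.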

Stick your combined blob into your ROM image (overwriting the old bytes so the file does not grow in size) starting at an offset of 0x51D70 (you should see an Apple copyright notice repeated over and over again along with other stuff). It can’t be longer than about 35 kilobytes, though, or it will start overwriting other stuff in the ROM. Make sure that some of Apple’s copyright notice is still visible and you’ll be fine! 🙂 If you don’t yet have a single ROM image because you read the ROM from the original chips, you need to combine them into a single file first — remember, they are interleaved!
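Here is a hedged Python sketch of that overwrite step. The offset comes from the paragraph above; the size limit is only my "about 35 kilobytes" estimate turned into a number, so treat it as approximate and still check by eye that some of the copyright notice survives. The function name is made up.

```python
CHIME_OFFSET = 0x51D70
MAX_BLOB_SIZE = 35 * 1024   # approximate limit, not an exact figure


def inject_blob(rom: bytes, blob: bytes, offset=CHIME_OFFSET, max_size=MAX_BLOB_SIZE) -> bytes:
    """Overwrite bytes of `rom` at `offset` with `blob` without changing the ROM's size."""
    if len(blob) > max_size:
        raise ValueError("blob is too long and would overwrite other stuff in the ROM")
    if offset + len(blob) > len(rom):
        raise ValueError("blob would run past the end of the ROM image")
    patched = bytearray(rom)
    patched[offset:offset + len(blob)] = blob   # same-length slice, so the size stays put
    return bytes(patched)


fake_rom = bytes(512 * 1024)                    # stand-in for a 512 KB ROM image
patched = inject_blob(fake_rom, b"\x01\x02\x03")
print(len(patched) == len(fake_rom), patched[CHIME_OFFSET:CHIME_OFFSET + 3].hex())
```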

Next, we need to patch the ROM image to jump to our newly-injected code instead of the standard startup chime code. It’s a simple matter of changing the four bytes starting at the offset 0x435D2. In your ROM image they should be:

FF FC 3A 82

You need to change them to:

00 00 E7 A0

Also, if you plan on using this patched IIci ROM in an SE/30 (and I’m guessing also a IIx/IIcx, but I can’t confirm that), you also need to change the four bytes starting at the offset 0x4122C. They should already be:

FF FC 5E 28

Change them to:

00 01 0B 46
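Since a typo in a hex editor is easy to make here, a small Python sketch can apply these patches and refuse to touch the image if the original bytes aren't what they should be. The offsets and byte values are the ones listed above; the function and constant names are made up.

```python
def patch_bytes(rom: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Replace `expected` with `replacement` at `offset`, failing if the ROM doesn't match."""
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the size of the ROM")
    found = rom[offset:offset + len(expected)]
    if found != expected:
        raise ValueError("unexpected bytes at 0x%X: %s" % (offset, found.hex()))
    return rom[:offset] + replacement + rom[offset + len(replacement):]


# (offset, original bytes, new bytes)
CHIME_PATCH = (0x435D2, bytes.fromhex("FFFC3A82"), bytes.fromhex("0000E7A0"))
SE30_PATCH = (0x4122C, bytes.fromhex("FFFC5E28"), bytes.fromhex("00010B46"))

# Fake 512 KB image with the original bytes planted, just to exercise the function.
fake = bytearray(512 * 1024)
fake[0x435D2:0x435D6] = CHIME_PATCH[1]
fake[0x4122C:0x41230] = SE30_PATCH[1]
rom = patch_bytes(bytes(fake), *CHIME_PATCH)
rom = patch_bytes(rom, *SE30_PATCH)   # only needed for the SE/30 case
print(rom[0x435D2:0x435D6].hex(), rom[0x4122C:0x41230].hex())
```

Running a patch a second time raises an error, because the original bytes are no longer there. That doubles as a cheap check that you aren't patching an already-patched image.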

Finally, recalculate the checksum using Ben’s tool I linked to above, split the ROM image into four segments, burn the segments to EEPROM, stick them in your IIci, and power it on! If all goes well, you will hear your awesome new startup sound. Otherwise, you probably have some troubleshooting to do 🙂

You know what’s really cool about this hack? I ended up using a Mac, Windows, and Linux together to do the job. Just the way I like it!

Update: Since I first posted this hack, I have gotten sampled startup chimes working. See my latest post.

There’s been a cool project I’ve been working on for the last month or so, and I finally got to a good point where I could share what I’ve done! In summary, I have given my old Macintosh IIci a custom startup chime. Get ready for a long-winded post if you dare to read it all…or skip to the bottom if you just want to see and hear my custom IIci start up.

I’ve seen the question asked online. How do you change that sound a Mac makes immediately after you press the power button? The answer is inevitably “you can’t.” The sound is stored in the Mac’s ROM, making it impossible to change in software. It’s hiding somewhere in read-only memory chips (usually) soldered onto your Mac’s motherboard, nestled alongside other code that sets up all the Mac’s hardware when you first turn it on.

A year ago, I was talking with a coworker and we thought it would be awesome to try to hack the hardware to change the sound. I decided at that time that the IIci would be a perfect candidate because first of all, I have two of them. Second, its ROM chips are DIP (dual-inline package) chips, which I feel safe soldering. I just don’t have enough surface-mount soldering experience to risk damaging a motherboard. The big problem was that I found out the IIci’s startup sound is not a sampled sound. That means it isn’t just a sound file embedded into the ROM that you can easily replace. Instead, the Mac creates the sound from scratch, given a few instructions about what notes to play and how fast they should play. I’ll talk more about that further down in this post.

Well, about a month ago, I started thinking about this idea again and decided to go for it. I started out by booting up my old IIcis and deciding which one to hack. One of them had a very faint startup chime, it wouldn’t boot up unless left unplugged for a while, and the hard drive wouldn’t spin up. I decided to hack this one because it was in the worst shape of the two. After checking out the 68k Mac Liberation Army forums, it became clear that the faint startup chime was probably because the electrolytic capacitors on the motherboard had spewed electrolyte all over. I inspected the motherboard and sure enough, many of the capacitors had gunk nearby. Here’s an example of what I’m talking about:

Gross, right? This one happens to be right next to the power button. Perhaps it was causing the power button to work intermittently. Several of the other capacitors looked very similar, and in some cases it appeared that some of the electrolyte had actually gotten onto nearby components. This stuff is nasty and can corrode the motherboard over time, so I knew I had to get it off.

You may notice that these capacitors are surface-mount, and I just got done explaining how I’m not very comfortable with surface-mount soldering. Well, this isn’t terribly complicated soldering. I get nervous when I’m doing components with tens of pins very close together. This stuff wasn’t so bad–there’s a connection on each side, so you really just have to be careful about lifting pads. I did kind of lift one, but it is still electrically connected and everything’s fine. Anyway, I removed all the capacitors (including the through-hole ones as well). I could definitely tell the capacitors were bad when they began smelling like fish as I removed them. After taking them all out, I cleaned the board with 99% isopropyl alcohol (great for cleaning circuit boards! It evaporates almost instantly). I especially went over the areas that looked like the one pictured above. I then put in brand new capacitors.

While I had the board out, I removed the four ROM chips:

Removing them was pretty interesting. I did a lot of research on the web to see what other people thought about removing DIP chips in old forum posts, tutorials, videos, etc. The attitude I found online was mostly: “you can save the board or the chip, but not both.” Most suggestions were to cut the old chips out and then easily remove the remnants of the pins with a soldering iron. I also found a few videos of people who had vacuum desoldering guns, in particular the Hakko 808. The thing had great reviews, so I got one. It’ll be nice to have in my set of tools anyway. It requires a lot of maintenance to keep it working and prevent the pump from getting messed up, but it’s definitely worth it just for the convenience. I really wanted to get the original ROM chips out intact, too, so I’m happy with that. I ran into a bit of trouble with either the ground or VCC pin (can’t remember which) because it takes longer to heat, but other than that, it was pretty easy to use it to remove the chips. I added a bit of extra solder to help the old solder heat up, and the gun sucked most of it out painlessly after that. Sometimes the chips did not lift out easily, even though I couldn’t find any visible solder with my eye loupe. I could tell that the solder was essentially gone, though, so I yanked the chip out with a bit of force to break the minuscule connection the solder still had, and it didn’t damage anything. That may have been stupid on my part, but it worked out for me!

So I replaced the ROMs with sockets:

Then, I stuck the original ROMs back in the sockets, and booted the IIci up. Success! Not only did it boot, but the sound was no longer faint and the power button worked every time, so the capacitor replacement fixed that. However, I knew I’d be using some kind of EEPROM in place of the original ROMs. Notice that one of the original ROMs says it is a TC531001CP. I somehow managed to find a (really old) datasheet for it online and discovered it uses a standard pinout that most 32-pin DIP devices also use.

I found a couple of reprogrammable SST chips of the same capacity which seem to fit the pinout and electrical specs: the SST27SF010 and the SST29EE010. It looks like Greenliant broke off from some of SST’s flash stuff, so they are actually now the GLS27SF010 and the GLS29EE010. I started out with the GLS27SF010. It has a couple of extra pins for programming the chip that are not connected on the IIci’s original ROM, so I had to tie them to VCC on the bottom of the motherboard. The GLS27SF010 was really annoying to program with my Willem programmer, though, because I had to change a jumper on the programmer board any time I wanted to erase it, and remove it to program it. So I switched to the GLS29EE010, which is a lot easier to use, is rated for many more erase cycles, and only has one pin that has to be connected to VCC. I read the contents of the original ROMs, burned the contents to my EEPROMs, stuck them in the motherboard, and it booted!

This completed the hardware portion of my hack. At this point I knew I could reprogram my EEPROMs, stick them in the IIci, and get it to do cool stuff.

I figured out that the four ROM chips are each 8-bit ROMs and the computer itself addresses 32 bits at once. How this works is one of the ROMs is connected to data lines 0 through 7, one is connected to 8 through 15, another is connected to 16 through 23, and the final one is connected to data lines 24 through 31. This means that as far as what the computer sees, to turn the four ROM images into a single ROM image I could disassemble, I had to take one byte at a time from each chip:

Chip 1, Byte 0
Chip 2, Byte 0
Chip 3, Byte 0
Chip 4, Byte 0
Chip 1, Byte 1
Chip 2, Byte 1
Chip 3, Byte 1
Chip 4, Byte 1

I wrote a quick program to do this and was on my way.
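Here is a minimal sketch of what such a program could look like, in Python. It is not the original tool. It assumes the four chip dumps are already loaded as equal-length byte strings and that they are passed in the order listed above (chip 1 first); the function names are made up for illustration.

```python
def interleave(chips):
    """Merge per-chip dumps into one image: chip 1 byte 0, chip 2 byte 0, ..."""
    if len({len(chip) for chip in chips}) != 1:
        raise ValueError("all chip dumps must be the same length")
    image = bytearray(len(chips) * len(chips[0]))
    for lane, chip in enumerate(chips):
        # Every len(chips)-th byte of the image, starting at `lane`, is this chip's.
        image[lane::len(chips)] = chip
    return bytes(image)


def deinterleave(image, count=4):
    """Split a combined image back into one dump per chip, ready for burning."""
    if len(image) % count != 0:
        raise ValueError("image length must be a multiple of the chip count")
    return [image[lane::count] for lane in range(count)]
```

The second function is the reverse step needed later, when a modified image has to be split back into four files before burning.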

In order to get any further, I had to turn to posts by Dennis Nedry on the 68k Mac Liberation Army forums. After following his posts around I was able to discover he had done a lot of the dirty work for ROM hacking. It turns out that Macs have a checksum in the first four bytes of their ROMs, and if it doesn’t match, the Mac won’t start. He had already figured all this out and written code to calculate and check the checksum of a ROM image. Without his work, there’s no way I would have gotten any further. I probably would have given up and just said “whatever, Apple’s protecting their ROM from being modified, I guess I’ll go mess around with something else.” Anyway, he had a ton of useful information in his posts on those forums.

I started out with the basics and changed an icon. After some crazy searching in the ROM image, I was able to locate the floppy disk icon that appears if you boot a Mac without a startup disk present. There are actually three icons: a floppy disk, a floppy disk with a question mark on it, and a floppy disk with an X on it. When there’s no startup disk present, it alternates between the plain floppy disk and the floppy disk with a question mark. I changed the plain floppy disk icon to a black Apple logo, and the question mark floppy icon to a white Apple logo. Then I recalculated the checksum, split the ROM image into four files, burned them, stuck them into the IIci, and got this:

OK, that’s great! I had proven that I could do it. But then I wanted to move on to bigger and better things: my original goal, the custom startup chime.

At this point, I have to once again thank the people on the 68k Mac Liberation Army forums who gave me some great pointers on what to look for in my post. Trash80toHP_Mini was kind enough to scan a bunch of pages of an old out-of-print Apple hardware book for me as well. Let me tell you, there are some really cool people on those forums!

I started disassembling the ROM code and searching for places that referenced the Apple Sound Chip located at 0x50F14000 (and other repeated addresses as well). It took a while to get comfortable with 68k assembly, but it’s really not that bad. It’s actually kind of nice compared to x86 assembly in my opinion. This site was an invaluable reference. In the middle of this, I installed MPW (Apple’s old software development setup) and happened to discover that it included several Mac ROM maps! The IIci was one of those. I found some interesting labels in the ROM maps called BOOTBEEP, BOOTBEEP6, ERRORBEEP1, ERRORBEEP2, ERRORBEEP3, and ERRORBEEP4. Using that info, I later discovered that the ROM location I was in the middle of disassembling did indeed happen to be the startup chime.

Once I got a clue as to what the code was doing, I figured out a way to play the various chimes provided by writing a Mac app that played them. It turns out BOOTBEEP is just a handler that plays BOOTBEEP6, which is the actual startup chime. ERRORBEEP1 is the minor chord that plays in the chimes of death before the arpeggio. ERRORBEEP2 is a weird tone I had never heard before. ERRORBEEP3 is that same tone, with another note added at the end. ERRORBEEP4 is the familiar arpeggio error chime.

Each one of these sounds is a structure in the ROM that is passed to a ROM sound synthesis function that plays it using the Apple Sound Chip. When you hear the chimes of death, it’s actually playing two sounds consecutively: first ERRORBEEP1 for the chord, then ERRORBEEP4. After doing some serious disassembly of the sound synthesis function, I more or less understand what it does. The MAME source code was extremely useful to help me understand some of what the function was telling the Apple Sound Chip to do.

The synthesis function uses the wavetable synthesis capability of the sound chip. It is given one to four frequencies to play. It also is given several time values. One tells it how long before the wavetable should be updated. I believe it essentially specifies how quickly the waveform evolves and eventually fades out to nothing. Another time value sets how many steps should occur between playing each of the frequencies it’s given. Once a frequency is playing, it leaves it running and starts the next frequency in another voice. So at the end of the chord, all four voices are playing. If you make this a small number, it will play them all close together and it will be a chord like the startup chime. If you make it a large number, it will play them spaced apart, more like the death chime arpeggio. There is also a time value that specifies how long the sound should play in total. This would allow you to keep the sound going for a while after all the notes are playing. The frequencies are specified in terms of a 24-bit fixed-point increment value, basically telling the wavetable synthesizer how many entries in the wavetable to skip each time it reads a value. This effectively changes the frequency of the sound that plays based on what you set the increment value to.

It may sound complicated (it is!) but what it comes down to is this: the startup chime on a Mac IIci takes up 32 bytes of ROM space (not counting the synthesis function, which is reused for all the other startup sounds like the error chime). That’s pretty impressive, and you can see why they went that route to save ROM space.

So, here’s some info on the structure of a synthesized sound in the ROM:

  • A 16-bit number that seems to specify some kind of a volume setting.
  • A 32-bit number that sets how fast the waveform changes (a “step length” perhaps?)
  • A 32-bit number that sets how many steps elapse before starting to play the next frequency in the list of frequencies
  • A 32-bit number that sets how many steps before the sound is done.
  • A 16-bit number that specifies the number of frequencies (maximum of 4, since the sound chip has 4 voices)
  • For each frequency, a 32-bit number that contains the 24-bit fixed-point increment value I described earlier. The top 8 bits are not used.
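The layout above can be sketched as a small parser. This is only an illustration in Python: the field names are my own guesses at what each value means (the list above is itself partly guesswork), and the byte order is assumed to be big-endian because the 68k is a big-endian CPU.

```python
import struct

# volume (16-bit), step length, steps between notes, total steps (32-bit each),
# number of frequencies (16-bit). ">" means big-endian with no padding.
HEADER = struct.Struct(">HIIIH")


def parse_sound(data, offset=0):
    """Decode one synthesized-sound structure starting at `offset`."""
    volume, step_length, note_gap, total_steps, count = HEADER.unpack_from(data, offset)
    if not 1 <= count <= 4:
        raise ValueError("expected 1 to 4 frequencies")
    raw = struct.unpack_from(">%dI" % count, data, offset + HEADER.size)
    return {
        "volume": volume,
        "step_length": step_length,
        "note_gap": note_gap,
        "total_steps": total_steps,
        # Only the low 24 bits carry the increment; the top 8 bits are unused.
        "increments": [value & 0x00FFFFFF for value in raw],
    }


def sound_size(count):
    """Bytes occupied by a structure holding `count` frequencies."""
    return HEADER.size + 4 * count
```

With four frequencies the structure comes out to 16 + 4 × 4 = 32 bytes, which matches the size of the startup chime mentioned earlier.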

Once I figured all this out, I just had to find some space in the ROM to put my Mario tune. I found what appears to be 35 KB of empty space, filled with many repetitions of an Apple copyright notice and a date. I made several of these sound structures for all the different chords I needed to play, and stuck them in there, along with code to play them in the correct sequence. I then modified the startup chime code to jump to my code instead. Here are the results:

Yes, that’s right. A Super Mario Bros IIci. It reboots after the first Happy Mac because I don’t have a PRAM battery installed on the motherboard. After that it continues booting fine.

I’d like to figure out how to make it play a sampled startup sound, but it may be difficult to figure out how to set up the sound chip for that mode. (Forget that, I now have sampled startup sounds working!) Anyway, let me know if you have any questions and I’ll try to answer them…and thanks again to everyone who helped me out with this project either directly or indirectly!

P.S.: Remember how I said the hard drive wouldn’t spin up? Well, I found a really, really old newsgroup posting online where someone with the exact same drive model had the exact same problem. The advice given? Give it a smack (not too gentle, but not too rough either) while it’s powered up. I hit it with a screwdriver and sure enough, it started spinning up. It’s been fine ever since through many power cycles! All the data was still intact. I certainly wouldn’t try that trick with the hard drives we have today, but I guess with stuff from the lower-density storage era, it’s not a terrible idea!

SPI. You may have heard the acronym before. I pronounce it letter-by-letter: “S-P-I”. I think I had heard of it in the past before I learned how to program microcontrollers, but I had no idea what it was. Everyone at work was talking about how we use an “SPI flash chip” or an “SPI driver chip”. Well, eventually I did have to learn what it was, so I’ll try to explain it as easily as I can.

SPI stands for Serial Peripheral Interface. Let’s break it up into two parts:

Peripheral Interface

Peripheral interface means it’s a way to talk to peripherals using your microcontroller. It’s an interface for peripherals. You might have a temperature sensor chip that you need to receive readings from, or an accelerometer, or external memory storage such as a flash chip (like a computer’s BIOS chip, for example). Any of these could be considered to be peripherals. You could even consider your microcontroller to be the peripheral — more on that later.

Serial

Serial refers to the method that is used to communicate between your microcontroller and the peripheral. If you’re like me, you’ve heard of data transfer being “serial” or “parallel” — for example, older computers usually had two serial ports and a parallel port. It breaks down like this: when computer data is sent serially, you’re sending data over a single wire, a bit at a time. When computer data is sent in parallel, you’re sending multiple bits at once. For example, if you have eight wires, you could transmit a byte at a time by putting each of the eight bits in a byte onto a corresponding wire, a 1 being represented by a “high” (5V) value, and a 0 represented by a “low” (0V or ground) value. That would be parallel communication. If you put one of the bits onto a single wire, waited a short time, put the next bit onto the same wire, waited a short time, and so on, that would be considered serial communication.
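To make the "one bit at a time" idea concrete, here is a small Python sketch that turns a byte into the sequence of bits you would put on a single wire, and back again. It assumes the most significant bit goes first, which is only a convention; the function names are made up for this example.

```python
def byte_to_bits(value):
    """List the 8 bits of a byte in the order they would be sent, MSB first."""
    return [(value >> position) & 1 for position in range(7, -1, -1)]


def bits_to_byte(bits):
    """Rebuild a byte from 8 bits received one after another, MSB first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

A parallel link would put all eight of those list entries on eight wires at the same moment; a serial link sends them down one wire in sequence.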

SPI is a serial protocol because communication between your microcontroller and the peripheral happens over a single wire in each direction. There’s one wire for data transmission from your microcontroller to the peripheral, and there’s another wire for data transmission from the peripheral back to your microcontroller. You might be wondering: isn’t that parallel if there are two wires? Well, each wire is in a different direction so that doesn’t really count.

OK, so now we have that out of the way. Let’s dive into some more terminology.

There are two types of SPI devices: masters and slaves. On an SPI bus, there is one and only one master. It’s in control of all communication. There can be multiple slaves, but there should be at least one; otherwise the master doesn’t have anything to talk to. The master decides which slave it wants to talk to. It can only talk to one slave at any time (except under certain circumstances when slaves are daisy-chained together; more on this later as well).

In most cases, your microcontroller will be the master, and the peripherals will be the slaves. You could, however, communicate between two microcontrollers with SPI by letting one be the master and one be the slave.

SPI uses these four wires:

  • CLK (clock)
  • MOSI (master out, slave in)
  • MISO (master in, slave out)
  • CS (chip select)

Actually, there is a separate chip select line for each slave you want to talk with. So if you have three slaves, you actually need a total of 6 wires — CLK, MOSI, MISO, and a CS wire for each slave.

MOSI and MISO are pretty straightforward. Data sent out of the master to the slave is transmitted over the MOSI line (master out, slave in). That makes sense because the data is going out of the master and into the slave. Likewise, data sent from the slave to the master will be transmitted over the MISO line (master in, slave out). Again, that makes sense because the data is going out of the slave and into the master.

The chip select line should make sense too, because that’s how each slave knows whether the master is talking to it or not. Basically, the master leaves all the chip select lines high when not talking to any slaves. When it decides it needs to talk to a slave, it brings that slave’s chip select line low, leaving all the other chip select lines high. That way, that particular slave knows the master is talking to it, so it knows that it should be the slave to respond. All other slaves will ignore any incoming data. Note: some slaves expect the opposite behavior: the CS line would normally be low, and only high when talking to the slave. You have to check the slave chip’s datasheet to see how it operates. The terminology here is that if a slave’s chip select line is asserted, it means that the master is talking to it.

That leaves us with the clock line. The clock line is probably the most important line of all the SPI lines. It is what handles the timing. The clock line alternates between high and low, and is controlled by the master. It is how the slave device determines when it is time to read the MOSI line to see what bit got sent to it by the master, and also how it knows when to change what it has written to the MISO line to send a bit back to the master. Since the master is in complete control of the clock, the slave needs to (pretty quickly) respond properly whenever the clock line changes. For this reason, slave devices will specify a maximum clock rate. The maximum clock rate is referring to how fast the master is allowed to flip the clock line between high and low. The master should not flip the clock line any faster than what the slave specifies as its maximum clock rate — otherwise, weird stuff will occur because the slave probably won’t be able to respond quickly enough.

Having a clock line might seem kind of weird. With other types of serial communication, there isn’t a separate wire for the clock. For instance, in a standard RS-232 PC serial port, there is not a clock wire. In that form of serial communication, both ends of the communication have to know ahead of time what the clock rate is. They stay in sync with each other because they both know how long the delay should be based on that predetermined clock rate, combined with a small delay between successive characters sent. On the other hand, as I already said, with SPI the master is in control of everything including the clock. The master decides how fast data is sent and received (as long as it is within the tolerable limits of both the slave and master). This whole setup is possible by having a separate wire just for the clock.

SPI communication is usually 8- or 16-bit, but it could be any number of bits. By that, I mean the number of bits exchanged between the master and the slave in one complete message; with an 8-bit device, a message is complete once 8 bits have gone out and 8 bits have come back. It all depends on how the slave has implemented SPI, so you really have to study the slave device’s datasheet carefully to determine how to configure the master to talk with it.

There is one other concept that might be confusing at first: the slave is always sending data back to the master at the same time the master is sending data to the slave. Every time the master sends a bit to a slave, a bit comes back in from the slave. If the slave needs to know what all the bits are before it can do something, what it sends back to the master might not mean anything until the master sends another set of bits, which will then give the slave device an opportunity to reply.

Let’s do an example to see what I mean. Say you are a master device communicating with an SPI temperature and humidity sensor. How would you read the temperature and humidity data from the sensor? We need to know the communication protocol that the sensor uses, which will be defined in its datasheet. For now, I’ll make up a protocol.

Let’s say that the SPI temperature and humidity sensor accepts eight bits at a time (one byte):

  • 0x52 means “read the temperature”
  • 0x53 means “read the humidity”

So you will send a byte to the sensor to tell it which reading you would like to see–either 0x52 or 0x53. If you send it anything other than 0x52 or 0x53, you will get garbage back (or maybe all zeros). So let’s read the temperature by sending 0x52 to it.

So you send 0x52 over SPI to it. In binary, 0x52 is 01010010. First, you assert the sensor’s chip select line. Next, one bit at a time, starting with the leftmost bit, you set the MOSI line to:

0
1
0
1
0
0
1
0

(toggling the clock line as you go). Meanwhile, each time you send a bit to the sensor, it is responding with a bit. However, since the sensor does not know which command you are telling it until the entire byte has arrived, it will just reply with zeros for now. So you receive a reply of 0x00 (eight zeros) on the MISO line, and ignore it since it doesn’t mean anything. Finally, you will deassert its chip select line to let it know that you’re finished.

Now, the sensor knows which reading you wanted, but since the master is in control, the slave is not allowed to just send it to the master. Instead, the master has to initiate another transfer to allow the slave to send the reading back. So the master will send another byte, which can actually be anything (the slave will ignore the bits coming in over the MOSI line — all it knows is that it will send the temperature reading out over the MISO line). So you assert the chip select line, then send all zeros (or 0xAB or 0x15 or whatever else you want), and it replies with:

0
1
1
0
0
0
0
1

or 0x61, which is 97 in decimal. This corresponds to a temperature reading of 97 degrees Fahrenheit. Finally, deassert the chip select line. Now you can repeat the same process to read the humidity, or to read the temperature again. Get it? It’s really not that tough. Note that I made up the protocol in this case, and other chips may behave differently. This is just one example, and it’s very similar to how a GPIO expander chip I have used in the past works–you send it a command, then you send it a dummy byte to read back the results of the command.
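The whole exchange can be played out in a short Python simulation. This is a sketch of the made-up protocol above, not of any real chip: FakeSensor and transfer are hypothetical names, the clock is reduced to a single method call per bit, and an unknown command is assumed to produce a reply of all zeros.

```python
class FakeSensor:
    """Simulated slave: command 0x52 = temperature, 0x53 = humidity."""

    def __init__(self, temperature, humidity):
        self.readings = {0x52: temperature, 0x53: humidity}
        self.pending = None     # reply waiting for the master's next transfer
        self.replying = False
        self.shift_in = 0       # bits collected from MOSI
        self.shift_out = 0      # byte being shifted out on MISO

    def select(self):
        """Chip select asserted: start a new 8-bit exchange."""
        self.replying = self.pending is not None
        self.shift_out = self.pending if self.replying else 0x00
        self.pending = None
        self.shift_in = 0

    def clock(self, mosi_bit):
        """One clock pulse: take a bit from MOSI, hand a bit back on MISO."""
        miso_bit = (self.shift_out >> 7) & 1
        self.shift_out = (self.shift_out << 1) & 0xFF
        self.shift_in = ((self.shift_in << 1) | mosi_bit) & 0xFF
        return miso_bit

    def deselect(self):
        """Chip select deasserted: act on the byte unless it was a dummy."""
        if not self.replying:
            self.pending = self.readings.get(self.shift_in, 0)


def transfer(slave, byte_out):
    """Master side: exchange one byte, most significant bit first."""
    byte_in = 0
    slave.select()
    for position in range(7, -1, -1):
        mosi = (byte_out >> position) & 1
        miso = slave.clock(mosi)
        byte_in = (byte_in << 1) | miso
    slave.deselect()
    return byte_in
```

Calling transfer(sensor, 0x52) returns 0x00 (the meaningless first reply), and the following transfer with any dummy byte returns the temperature.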

I promised that I would talk more about two other concepts earlier: allowing your microcontroller to be the slave, and also daisy chaining. Here we go:

Your microcontroller could actually be a slave device. In that case, it would monitor the chip select line to see if a master is talking to it. Then, it would monitor the clock line, writing bits to the MISO line and reading bits from the MOSI line as necessary based on how the clock line changed. You could easily use this type of thing to implement communication between two processors, although it might be a little overkill when you could do the same thing with a UART (a normal serial port).

Daisy chaining allows you to talk to multiple chips at once. An example would be if you have two of the same type of chip connected to each other like so (also keeping the clock and chip select lines connected to both chips at the same time):

In this picture, the MOSI line of the master is connected to the MOSI line of the first slave. This part is normal. But here’s where it gets weird: the MISO line of the first slave is connected to the MOSI line of the second slave! So data coming OUT of the first slave will go IN to the second slave. Finally, the MISO line of the second slave is hooked back into the master microcontroller. Essentially, any time you want to talk to the chips, you send data for each of the chips in sequence BEFORE deasserting the chip select line. So if you have two chips hooked up, you would send two bytes. On chips that support this, this will cause the first byte to go to the slave farthest away from the microcontroller, and the second byte will go to the slave that is connected directly to the microcontroller’s MOSI line. The first chip serves as a pass-through to the second chip, but it holds on to the last byte it receives. Finally, when you deassert the chip select line, each chip will actually interpret the byte it holds. You could do this with any number of slaves; it’s not limited to two. Likewise, when you read from them, you will read multiple bytes. The first byte will be from the chip farthest away in the chain from the master, the second byte will be from the next closest chip, and so on, until you’ve reached the chip that is closest to the master. That’s really all there is to daisy-chaining.
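A daisy chain behaves like one long shift register, which is easy to simulate. The Python sketch below assumes each chip is nothing more than an 8-bit shift register that latches its contents when chip select is deasserted; real chips would usually load their own reply data rather than echo what they received. ChainedChip and send_to_chain are invented names.

```python
class ChainedChip:
    """Simulated slave: an 8-bit shift register plus a latch."""

    def __init__(self):
        self.shift = 0
        self.latched = None

    def clock(self, bit_in):
        bit_out = (self.shift >> 7) & 1              # oldest bit leaves on MISO
        self.shift = ((self.shift << 1) | bit_in) & 0xFF
        return bit_out

    def latch(self):
        self.latched = self.shift


def send_to_chain(chain, data):
    """chain[0] hangs off the master's MOSI; chain[-1] drives the master's MISO."""
    received = bytearray()
    for byte_out in data:
        byte_in = 0
        for position in range(7, -1, -1):
            bit = (byte_out >> position) & 1
            for chip in chain:                       # each output feeds the next chip
                bit = chip.clock(bit)
            byte_in = (byte_in << 1) | bit
        received.append(byte_in)
    for chip in chain:                               # chip select deasserted
        chip.latch()
    return bytes(received)
```

Sending two bytes to a two-chip chain leaves the first byte in the far chip and the second byte in the near chip. Clocking out two more bytes then returns the far chip's contents first.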

So with SPI, do you manage each of the four lines on your own? Do you manually control the MOSI, MISO, Chip Select, and Clock lines on your microcontroller, manually toggling the clock line using the GPIO peripheral built into your microcontroller? You absolutely can do it that way — it’s called bit banging. It basically means that you implement the four wire protocol all on your own. You handle the timing of the clock line, when you assert chip select, and also the timing of when you change what’s on the MOSI line and read what’s on the MISO line. However, you would be crazy to do it that way on most microcontrollers in most cases.

Most microcontrollers have at least one memory-mapped SPI peripheral built in. You configure it by telling it the clock rate, how many bits are transmitted per transmission, and other information such as whether the chip select line should be LOW when asserted or HIGH when asserted, and it handles everything for you! After setting it all up, you can simply write a byte to one of the peripheral’s registers, and it will send the data out perfectly, letting you make more efficient use of your CPU’s time instead of worrying about timing. Then, you can determine when the transfer is complete and read the data that the slave sent back. However, I’ve gone on long enough in this post. This post was simply an answer to the question “what in the world is SPI?” In my next post, I will actually show you how to use the SPI peripheral built into most microcontrollers so you don’t have to bit bang the protocol yourself.

The last time I talked about interrupts, I kind of described what interrupts are. I never really got into how to use them, though. In order to use an interrupt, you write an interrupt handler — a piece of code that the microcontroller jumps to when an interrupt occurs. How that interrupt handler is set up depends on which architecture you’re programming for. In any case, when writing it in a language like C, it’s basically a special function that may need some extra code at the beginning and/or end.

The trick with an interrupt handler is that when it’s done running, it needs to leave the processor in exactly the same state it was in before the interrupt occurred. Recall that a single C statement may break down into multiple assembly instructions that will likely involve modifying values in the microcontroller’s registers. Let’s say we’re incrementing a variable stored in memory. It will turn into three raw instructions. Let’s assume the compiler decides to use register 2 to modify this variable:

  1. Load the variable from RAM into register 2.
  2. Add 1 to the value stored in register 2.
  3. Save register 2 to the variable in RAM.

Let’s do a concrete example using this process. Let’s say that the variable stored in memory contains the value 200. Without worrying about interrupts, here’s what happens:

  1. Load the variable from RAM (it contains the value 200) into register 2. Now register 2 contains “200”.
  2. Increment register 2. Now register 2 contains “201”.
  3. Save register 2 back to RAM. Now the variable in RAM contains “201”.

That’s all fine and dandy. Now let’s say an interrupt occurs between steps 2 and 3, and it doesn’t properly restore the state of the CPU:

  1. Load the variable from RAM. Now register 2 contains “200”.
  2. Increment register 2. Now register 2 contains “201”.
  3. INTERRUPT! The interrupt handler runs, and it did some stuff that used register 2. It didn’t save the original value of register 2, so now register 2 contains whatever the interrupt left it at — let’s assume it’s 1234.
  4. Save register 2 back to RAM. Now the variable in RAM contains “1234”.

In my first interrupt article, I had a very similar example, but you need to understand why this example is different. In the first article’s example, the main program was busy modifying a variable in memory the exact same way this one was modifying a variable in memory. However, the interrupt handler was also writing to that same variable in memory. Because of the possibility of the interrupt handler changing the variable while the main program was also busy changing it, I had to protect against that possibility by temporarily disabling interrupts whenever I was modifying the variable in the main program.

In this example, however, the interrupt routine didn’t care about the variable in memory. It was doing some arbitrary operation — anything. Whatever the ultimate goal of the interrupt routine, it had to change register 2 to get it done. Unfortunately, it didn’t restore register 2 to the value it originally had. After the interrupt routine, the main program went along happily, totally unaware that the register’s value had changed. In a real-world situation, this kind of a bug would likely screw up several different registers, unless the interrupt routine was very, very simple and didn’t need to use many registers to get its work done.

So could we protect the code by disabling interrupts here, just like in the last scenario? I guess so, but it wouldn’t make any sense to do it that way. In order to protect the code from this kind of a problem, you would need to have interrupts disabled during the entire program! Otherwise, any time you enabled interrupts, you would be at risk of your registers being totally corrupted. Needless to say, disabling interrupts during your entire program would not be a viable solution — what’s the point of having interrupts if they’re disabled the entire time?

So what’s the solution to this kind of a problem?

You have to make sure your interrupt handlers play nicely. The first thing an interrupt handler should do is save the values stored in any registers it knows it’s going to be using. Where does it save them? Generally, it will store them onto the stack. Likewise, the last thing an interrupt handler should do is restore any registers it saved when it first began. Also, it may have to execute a special instruction for returning from interrupts as its last instruction.
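The difference between a handler that clobbers a register and one that saves and restores it can be shown with a toy simulation. This is Python pretending to be a CPU, not real firmware: TinyCPU, its four registers, and both handlers are invented for illustration, and the interrupt is simply a function called between steps 2 and 3.

```python
class TinyCPU:
    """Toy machine with a few registers, a stack, and one variable in RAM."""

    def __init__(self):
        self.registers = [0] * 4
        self.stack = []
        self.ram = {"counter": 200}

    def increment_counter(self, interrupt=None):
        self.registers[2] = self.ram["counter"]     # 1. load into register 2
        self.registers[2] += 1                      # 2. add 1
        if interrupt is not None:
            interrupt(self)                         # interrupt fires right here
        self.ram["counter"] = self.registers[2]     # 3. store back to RAM


def careless_handler(cpu):
    cpu.registers[2] = 1234                 # uses register 2, never restores it


def polite_handler(cpu):
    cpu.stack.append(cpu.registers[2])      # first thing: save register 2
    cpu.registers[2] = 1234                 # do the same work as before
    cpu.registers[2] = cpu.stack.pop()      # last thing: restore register 2
```

With careless_handler the counter ends up as 1234; with polite_handler it ends up as 201, exactly as if no interrupt had happened.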

So rather than guarding against the interrupt everywhere else, you attack it at the source — the interrupt handler has to be nice enough to play along with the rest of your program.

It turns out that some microcontrollers are actually cool enough to save the registers for you. The Freescale 68HC11 is an example of a microcontroller that pushes all of its registers onto the stack before it jumps to the interrupt handler. That’s nice, but the 68HC11 doesn’t have many registers. On a more complex CPU, automatically saving all the registers just isn’t an option.

Some compilers will do all of this for you if you specify that a function is an interrupt handler. You might do this by adding __interrupt__ to its definition:

__interrupt__ void timer_intHandler(void);

It all depends on the compiler and the CPU architecture. You might even have to manually write the interrupt handler’s prologue and epilogue yourself with assembly.

I’m personally a big fan of the way the ARM Cortex-M3 works with interrupt handlers. Before I can get into it, though, I need to talk about ARM functions.

The Procedure Call Standard for the ARM Architecture states that any time you call a function, the first four registers (R0 through R3) are used to pass arguments to the function, and the function can also use them as scratch registers. So any time you call an ARM function, if something important is in R0 through R3, you need to save it before calling the function, because you’re not guaranteed that it will still be there when it finishes up (in fact, if the function returns something, the return value is stored in R0). You are guaranteed, however, that the other registers will still be intact after the function finishes up. Thus, if a function modifies pretty much any register other than R0-R3, it needs to save the value of it so it can restore it to its original state when finished. ARM C compilers automatically generate code that adheres to this procedure call standard. Sounds a lot like what an interrupt handler has to do, right?

The Cortex-M3 takes advantage of this fact. Before it jumps to an interrupt handler, the hardware automatically pushes R0, R1, R2, and R3 onto the stack (along with R12, LR, the return address, and the program status register). Then it jumps to the interrupt handler. The C compiler follows the procedure call standard and makes sure it preserves the other registers it uses by generating code at the beginning of the function to push their values onto the stack (and matching code at the end of the function to pop the values off of the stack and back into the registers). Then, when the interrupt handler is finished, the processor pops the registers it saved back off the stack. Since it works this way, a Cortex-M3 interrupt handler is nothing more than a normal C function! No special assembly or extra attribute needs to be added to the function. It just works out of the box.

As I said, though, on other architectures that don’t take advantage of rules like this, you will probably need to specify to the compiler that a function is an interrupt handler, and it will take care of all the saving registers mumbo jumbo for you.

There is one more thing I want to talk about. How do you tell the CPU what interrupt handler is for what interrupt? Let’s say your CPU has several interrupts — your timer has an interrupt, there’s an Ethernet controller interrupt, a USB interrupt, and several others. How does the microcontroller know that an interrupt handler belongs with a particular interrupt?

This is handled with what is called a vector table. A vector table is just a list of addresses to jump to. The first one might be the reset vector, which is where the microcontroller should jump to when it first starts. The next one could be for the timer, the next for the Ethernet, and so on. The microcontroller’s data sheet will specify which position in the list is for each interrupt. In high-level C terminology, you could say that a vector table is an array of function pointers pointing to the interrupt handlers.
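The "array of function pointers" idea looks like this when sketched in Python, where a list of functions plays the role of the table. The handler names and their order are invented; on a real microcontroller the data sheet dictates the positions, and the hardware does the lookup, not your code.

```python
def reset_handler():
    return "reset"


def timer_handler():
    return "timer tick"


def ethernet_handler():
    return "ethernet packet"


def default_handler():
    return "unhandled"


# Position in the list is what matters: entry 0 is reset, entry 1 the timer, etc.
VECTOR_TABLE = [reset_handler, timer_handler, ethernet_handler, default_handler]


def dispatch(vector_number):
    """Mimic the hardware: look up the handler by position and jump to it."""
    handler = VECTOR_TABLE[vector_number]
    return handler()
```

Here dispatch(1) runs the timer handler simply because the timer handler sits in slot 1 of the table.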

So you create this vector table and put it in a place where the microcontroller expects it to be (often at the beginning of the program’s code), and then the microcontroller will know where to jump whenever an interrupt occurs. Your IDE may help you set up a vector table, and if it doesn’t, there will be sample code somewhere that will show you how to do it.

That’s enough for today. I’ve hopefully gone into more depth about what an interrupt handler is and why it has to be special (except on the Cortex-M3 and possibly others). I hope I didn’t go too crazy when talking about the Cortex-M3 (it’s a really nice architecture, I couldn’t resist!). I’m not sure exactly what my next article will be about, but I’m thinking I may start talking about some of these other crazy peripherals built into a microcontroller such as SPI.

I recently bought a chumby one from Woot. It’s an extremely hacker-friendly device with a 454 MHz Freescale i.MX233 ARM processor, 3.5 inch touchscreen, USB port, accelerometer, speaker, internal USB wi-fi module, and an internal microSD card. It boots from the microSD card, so it’s pretty much un-brickable as long as you keep a backup of the original SD card contents.

It’s an awesomely cool device out of the box, but all of the GUI stuff is based on Flash. Now that’s wonderful and all, but I’m just not much of a Flash guy. I really like working with Qt on embedded devices, so I got a cross compiler up and running, allowing me to design stuff on my desktop computer (running Ubuntu 10.04) and deploy it onto the chumby. I have learned during my time as a developer to document what I did when I do things like this! Two years later it’s hard to remember exactly what I did. I learned this the hard way and now I always document a procedure like this as I’m going through it. I figured while I’m documenting it, I might as well share it with the world. These instructions will walk you through creating a modern cross compiler toolchain for the i.MX233 (compatible with the chumby’s libc), using that toolchain to compile Qt/Embedded 4.7.2, and finally, creating apps on your build machine and running them on the chumby.

The basics

Start out with an Ubuntu 10.04 (“Lucid Lynx”) installation. This procedure will probably work in newer and older versions, but I’m assuming you’re using 10.04. I’m going ahead and using an amd64 install of Ubuntu, but it should work fine in an i386 install as well. Once you have that installed (however you want to do it — directly on the computer, Wubi, in a virtual machine, or whatever other crazy install method you can conjure up), we’re ready to begin.

First of all, we need to install some prerequisite packages for various purposes. In a terminal window, type:

sudo apt-get install build-essential bison flex texinfo automake libtool cvs libncurses5-dev subversion

crosstool-NG

After getting all this stuff downloaded and installed, we’re ready to start creating the cross compiler. Download crosstool-NG (I used version 1.10.0) and unpack it somewhere. crosstool-NG is absolutely amazing. It saves so much trouble while creating cross compilers. You just go through a menuconfig interface similar to the Linux kernel’s menuconfig interface, telling it exactly what you want. If you’re lucky, you just tell it to work and it grabs all the source code, patches it, and compiles it automatically, leaving you with a fully-functional cross compiler after a half-hour or so. In this case, I have verified that the configuration I’m going to give you will work. In other cases, you may have to do some trial and error because some versions of C libraries and binutils don’t work with some versions of gcc. Generally you want to use tools that came from the same time period, as this will give you a better chance of everything working together. Anyway, go into the crosstool-ng directory in a terminal, and type:

./configure --local
make

The --local option tells crosstool-NG that we will be running it directly out of the directory we unpacked it to. Otherwise it would install itself into /usr/local or somewhere like that. I prefer it this way rather than putting it into /usr/local. This should create a script called ct-ng. We will be using this script to configure and create the toolchain.

So let’s start setting up the toolchain! I’d just give you the config file, but I’d rather walk you through setting up crosstool so you better understand how it works. Type:

./ct-ng menuconfig

After a few things finish setting up, you’ll be greeted with a menu-driven interface in the terminal. You can move up and down with the arrow keys. The enter key will go to the selected item. Hit the Esc key twice to go back. Let’s go through the sections:

Paths and misc options

It turns out I’m going to break my own rules here. I’m going to choose newer compilers and binutils with an older glibc. It happens to work in this case, so no harm done. I’m using an older glibc because the glibc on the chumby’s root file system is version 2.8, so I’d like our version to match. If we use a newer version, some binaries will not be compatible with the older glibc, so we’d have to replace chumby’s glibc, and I’d rather not.

Highlight Use obsolete features and press the space bar to select it. This will allow us to choose an older glibc that is probably not compatible. (We’re actually going to use eglibc, which plays better with our setup than glibc)

Scroll down and you’ll notice that Prefix directory is set to “${HOME}/x-tools/${CT_TARGET}”. This means it’ll create a folder called x-tools in your home directory which will store all cross toolchains you create. I left this alone, but if you’d like it elsewhere, go ahead and change it now.

Keep scrolling down until you find Number of parallel jobs. This one will save you some time if you have a dual- or quad-core processor. While creating the toolchain, this will allow some of the files to be compiled concurrently, making better use of your multiple-core CPU. For you techies out there, it’s letting you specify the -j option passed to the make commands that will run. I generally set this to the number of cores I have available. Since my 4-core Core i7 has hyperthreading, I normally choose 8. In a VMware virtual machine where I have given the VM two cores to work with, I’d choose 2. Different people have different opinions on what’s the best value here, but I’d say stick with the number of cores you have available and you’ll be fine. If you pick too high of a value your entire system will slow to a halt, and if you pick too low of a value, you’ll be under-utilizing your CPU.
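If you are not sure how many cores your build machine (or virtual machine) has, here is a minimal sketch for looking it up from the shell. The cpu_jobs function name is made up for this example. It tries nproc first, which may not exist on older systems, and falls back to getconf.

```shell
# Print the number of online CPU cores, as a starting value for
# "Number of parallel jobs". cpu_jobs is a hypothetical helper name.
cpu_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        # Fallback for systems without nproc
        getconf _NPROCESSORS_ONLN
    fi
}

echo "Suggested value for Number of parallel jobs: $(cpu_jobs)"
```

The number it prints counts hyperthreads as cores, which matches how I picked 8 on my Core i7.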

Okay, sorry about that long explanation. Now we’re ready to move to the next section. Hit Esc twice to go back to the menu. Move down to Target options and hit enter.

Target options

Highlight Target Architecture (alpha) and hit enter. Scroll down to arm and hit the space bar. We are telling crosstool-NG that we’re building a cross compiler that will target the ARM architecture, which is what the chumby uses.

chumby uses a Freescale i.MX233, which is an ARM926EJ-S. We need to tell crosstool which CPU we’re targeting, so scroll down to Emit assembly for CPU and hit enter. Type “arm926ej-s” (without the quotes) and hit enter.

Scroll down to Floating point and hit enter. Scroll down to software and hit the space bar. The i.MX233 does not have a hardware floating point unit, so we need to choose softfloat here (otherwise, the kernel emulates hardware floating point instructions, and it gets terrible performance).

We’re done here, so hit Esc twice, move down to Toolchain options, and hit enter.

Toolchain options

Our toolchain will be called “arm-chumby-linux-gnueabi”. The “chumby” part of this is called the vendor string, so we need to configure it as such. The vendor string makes no technical difference–it’s purely a cosmetic thing. Scroll down to Tuple’s vendor string, hit enter, backspace until you have erased “unknown”, type “chumby”, and press enter.

Exit this menu and go to the Operating System menu.

Operating System

Change Target OS to linux. You’ll notice that the kernel version is set at 2.6.37. It may not make a huge difference in some cases, but in this case, we need to pick a kernel version closer to the chumby’s kernel (2.6.28-chumby). Otherwise, the touchscreen library won’t work. We could probably get a tarball of the chumby’s kernel and be exact, but rather than bother with that, I set it at 2.6.27.57. Once you’ve made that change, exit this screen and go to Binary utilities.

Binary utilities

We’re not going to change anything here, but this is where you would change the version of binutils. Leave it at version 2.20.1.

Exit again and go to C compiler.

C compiler

Leave gcc at version 4.4.5. You need to scroll down under Additional supported languages to C++ and hit the space bar to make sure the toolchain is enabled for compiling C++. Now move down to C-library.

C-library

The C library is already set as eglibc, which is what we want. Change eglibc version to 2_8 (OBSOLETE). Normally this would be a bad idea to use an old version of eglibc combined with newer binutils and gcc, but we need version 2.8 in order to be compatible with the chumby’s provided glibc, and it works OK with the newer binutils and gcc. This is the sole reason we had to enable Use obsolete features when we were starting out. We also need to add an extra option here to fix a problem I encountered when I first tried to build this toolchain. Go to extra target CFLAGS and set it to “-U_FORTIFY_SOURCE”. Exit this screen and head over to Debug facilities.

Debug facilities

Go ahead and enable gdb (leaving it at version 6.8). We probably won’t use it, but it won’t hurt to have it. Leave this screen.

All done configuring

We don’t need to mess with anything else, so hit Esc twice and you will be asked if you want to save your new configuration. Hit enter, and you’re ready to build your toolchain!

Type:

./ct-ng build

crosstool-NG will update you on the status of your build as it keeps going on, but for now you can sit back and relax. If you’d like, skip ahead and start downloading some of the other stuff you will need to have as you progress further through these instructions. Depending on the speed of your computer, this compilation might take 15 minutes to an hour.

If all goes well, crosstool-NG will finish by saying “Finishing installation (may take a few seconds)…” followed by a shell prompt. At this point, your cross toolchain is complete! Assuming you left it to be installed at the default location, you can now compile programs for the chumby by using:
~/x-tools/arm-chumby-linux-gnueabi/bin/arm-chumby-linux-gnueabi-gcc
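Every tool in the toolchain follows the same naming pattern: prefix directory, then the tuple, then bin, then the tuple again as a prefix on the tool name. Here is a small sketch that builds those paths and checks whether the compiler is really there. The cross_tool function is a hypothetical helper, and its defaults assume the ~/x-tools prefix and the “chumby” vendor string chosen earlier.

```shell
# Build the full path to a cross tool from the prefix directory and tuple.
# cross_tool is a hypothetical helper; the defaults match this article's setup.
cross_tool() {
    tool=$1
    prefix=${2:-$HOME/x-tools}
    tuple=${3:-arm-chumby-linux-gnueabi}
    echo "$prefix/$tuple/bin/$tuple-$tool"
}

gcc_path=$(cross_tool gcc)
if [ -x "$gcc_path" ]; then
    # Show the compiler's version banner to confirm it runs
    "$gcc_path" --version
else
    echo "cross compiler not found at $gcc_path" >&2
fi
```

The same function gives you the paths for g++, ar, strip and so on, for example cross_tool g++.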

I made a quick test app:

chumbytest.c:

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("It works!\n");
    return 0; /* report success to the shell */
}

compiled it:

~/x-tools/arm-chumby-linux-gnueabi/bin/arm-chumby-linux-gnueabi-gcc chumbytest.c -o chumbytest

and transferred it to the chumby with scp to make sure it worked before advancing any further (assuming ssh has been enabled on the chumby):

scp chumbytest root@MyChumbysName.local:/tmp

to run it on the chumby over ssh:

ssh root@MyChumbysName.local
/tmp/chumbytest
exit

As long as it worked as expected, you’re ready to start working on getting Qt compiled. Well, kind of.
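If the test app refuses to run on the chumby, one thing worth ruling out is that it was accidentally built with the desktop compiler. The sketch below checks this on the build machine by reading the ELF header directly: the file must start with the ELF magic bytes, and the machine field at byte offset 18 is 40 for ARM. The is_arm_elf name is made up for this example, and it only looks at the low byte of that field, which is enough to tell ARM from x86 but is not a full ELF parser.

```shell
# Return success if the file looks like an ARM ELF binary.
# is_arm_elf is a hypothetical helper; it reads two fields of the ELF header.
is_arm_elf() {
    # First four bytes: 0x7f 'E' 'L' 'F'
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    # Low byte of e_machine (offset 18): 40 means ARM
    machine=$(od -An -tu1 -j18 -N1 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] && [ "$machine" = "40" ]
}

if [ -f chumbytest ]; then
    if is_arm_elf chumbytest; then
        echo "chumbytest is an ARM binary"
    else
        echo "chumbytest is NOT an ARM binary" >&2
    fi
fi
```

If the file utility is installed, running it on the binary reports the same information in a friendlier form.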

tslib

We need to do one prerequisite in order to get the touchscreen working, and it’s a library called tslib. Download it and extract it.

Start out inside the tslib directory and type:

./autogen.sh

Let’s go ahead and add our cross compiler to our path:

export PATH=~/x-tools/arm-chumby-linux-gnueabi/bin:$PATH

Now we’re ready to configure tslib to be cross compiled, and we’re going to tell it to install itself into a temporary directory inside our current directory. There are two hacks we have to do here–the first is we define a variable ac_cv_func_malloc_0_nonnull to be yes to avoid a compile error, and we also add -U_FORTIFY_SOURCE to the CFLAGS to avoid other errors, the same way we had to during the compilation of eglibc. Here’s the complete command:

ac_cv_func_malloc_0_nonnull=yes ./configure --prefix=$PWD/tslib --host=arm-chumby-linux-gnueabi CFLAGS=-U_FORTIFY_SOURCE

Compile it:

make -j 2

(Use 4 or 8 or whatever instead of 2, depending on how many cores you have.)

Install it:

make install

Now there should be a tslib subdirectory inside the tslib directory, containing the compiled library, plugins, and a config file in etc. We really need to put the tslib library and include files into the appropriate place in our cross compiler’s sysroot so we can use it when compiling. Type:

sudo cp -R tslib/lib/* ~/x-tools/arm-chumby-linux-gnueabi/arm-chumby-linux-gnueabi/sysroot/usr/lib

to copy all the libraries into the cross compiler. We should also copy the include files in there:

sudo cp tslib/include/tslib.h ~/x-tools/arm-chumby-linux-gnueabi/arm-chumby-linux-gnueabi/sysroot/usr/include

This will simply make it easier in the future when compiling, so we don’t have to specify where the libraries are located. Now, let’s get the touchscreen working on the chumby before we compile Qt. Transfer the compiled tslib subdirectory to the chumby’s storage:

tar -cf - tslib | ssh root@MyChumbysName.local tar -xf - -C /mnt/storage

Now we need to get a shell into the chumby, quit the chumby control panel, and set up a few environment variables so we can calibrate the touchscreen:

ssh root@MyChumbysName.local
/usr/chumby/scripts/stop_control_panel
export LD_LIBRARY_PATH=/mnt/storage/tslib/lib:$LD_LIBRARY_PATH
export TSLIB_TSDEVICE=/dev/input/by-id/soc-noserial-event-ts
export TSLIB_CALIBFILE=/mnt/storage/tslib/etc/pointercal
export TSLIB_CONFFILE=/mnt/storage/tslib/etc/ts.conf
export TSLIB_PLUGINDIR=/mnt/storage/tslib/lib/ts

Edit /mnt/storage/tslib/etc/ts.conf and uncomment the “module_raw input” line.
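If you would rather not open an editor on the chumby, the same change can be sketched with sed. This assumes the line appears in ts.conf as a “#” followed by optional spaces and then “module_raw input”, so check your copy of the file first. The enable_raw_input name is made up for this example, and I am assuming the sed on the chumby understands [[:space:]].

```shell
# Uncomment the "module_raw input" line in a tslib config file.
# enable_raw_input is a hypothetical helper; the commented-out form of the
# line is an assumption about how ts.conf is laid out.
enable_raw_input() {
    ts_conf=$1
    sed 's/^#[[:space:]]*module_raw input[[:space:]]*$/module_raw input/' \
        "$ts_conf" > "$ts_conf.new" && mv "$ts_conf.new" "$ts_conf"
}

target=/mnt/storage/tslib/etc/ts.conf
if [ -f "$target" ]; then
    enable_raw_input "$target"
fi
```

Other commented lines in the file are left alone, because the pattern only matches that one module name.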

You’re now ready to calibrate the touchscreen! Type:

/mnt/storage/tslib/bin/ts_calibrate

If all goes well, you should see “TSLIB calibration utility” and “Touch crosshair to calibrate” appear on the screen. Touch the crosshair for each corner and then the center, and the app will exit (the screen will stay the same, though). Now you’re calibrated! Test it by typing:

/mnt/storage/tslib/bin/ts_test

You should be able to drag the crosshair around on the screen. Hit ctrl-C to exit.

OK, so your touchscreen setup is now complete. You know all of those export commands you just typed before calibrating? You may want to put them into a file somewhere so they can be sourced whenever you need to run them. Whenever you want to use the touchscreen, those environment variables need to be set that way. We are now finally ready to compile Qt.

Qt

Here we go! The exciting part has arrived. We are ready to compile Qt. First, you need to download the latest source code for Qt. Pick the LGPL version unless you have a commercial license. Note that these instructions are based on version 4.7.2. Once you’ve downloaded the tarball, extract it and enter its directory. We now need to set up a Qt spec for the chumby.

Spec

Inside the mkspecs directory is a list of other directories containing specs for many different architectures. There is also a directory called “qws”. We want to use a qws spec, because our Qt will be drawing directly to the frame buffer device without having to have a window server like X11 installed. Instead, Qt will be its own window server. In the qws directory, there is a spec called linux-arm-gnueabi-g++. This one is very similar to the toolchain we are using, so let’s go ahead and duplicate it:

cd mkspecs/qws
cp -R linux-arm-gnueabi-g++ linux-arm-chumby-gnueabi-g++

Now, you need to edit the qmake.conf file inside of our new linux-arm-chumby-gnueabi-g++ directory. You will see several references to tools prefixed with “arm-none-linux-gnueabi-”. You should replace this prefix with the prefix for your toolchain. In fact, I would put the full path to each of these tools. For example, change arm-none-linux-gnueabi-gcc to /home/yourname/x-tools/arm-chumby-linux-gnueabi/bin/arm-chumby-linux-gnueabi-gcc, and so on. It may seem silly because you could always add the directory to your path, but when we’re executing the cross compiler from inside of Qt Creator, it will be easier if the full path is already there. Once you’re done with this, save the file and we’re ready to go!
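Since every reference uses the same prefix, the edit can also be done in one pass with sed. This is just a sketch: retarget_qmake_conf is a made-up name, and it assumes the toolchain lives in ~/x-tools/arm-chumby-linux-gnueabi/bin as set up earlier and that the path contains no “|” or “&” characters.

```shell
# Replace the stock tool prefix in a qmake.conf with full paths to our tools.
# retarget_qmake_conf is a hypothetical helper name.
retarget_qmake_conf() {
    conf_file=$1
    bin_dir=$2    # directory holding arm-chumby-linux-gnueabi-gcc and friends
    sed "s|arm-none-linux-gnueabi-|$bin_dir/arm-chumby-linux-gnueabi-|g" \
        "$conf_file" > "$conf_file.new" && mv "$conf_file.new" "$conf_file"
}

# Run from inside mkspecs/qws; does nothing if the spec has not been copied yet
spec=linux-arm-chumby-gnueabi-g++/qmake.conf
if [ -f "$spec" ]; then
    retarget_qmake_conf "$spec" "$HOME/x-tools/arm-chumby-linux-gnueabi/bin"
fi
```

Open the file afterwards and make sure each tool line points at a file that actually exists.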

Compiling

Go back to the top of the Qt source tree first (cd ../.. if you are still inside mkspecs/qws). Here is the complete command, along with descriptions of what every option does below it:

./configure -opensource -embedded arm -xplatform qws/linux-arm-chumby-gnueabi-g++ -no-feature-cursor -qt-mouse-tslib -nomake examples -nomake demos

-opensource: specify that we’re using the LGPL. If you have a commercial license you should change this.
-embedded arm: specify that we’re compiling Qt/Embedded for ARM.
-xplatform qws/linux-arm-chumby-gnueabi-g++: specify the make spec that will be used for compiling. This tells Qt where our cross compiler is, and info about the cross compiled system.
-no-feature-cursor: hides the cursor, since it doesn’t make much sense to have one with a touchscreen. If you want a cursor, you can remove this option.
-qt-mouse-tslib: tells Qt to use tslib for its mouse driver
-nomake examples: Tells Qt to skip the examples. I already know how to use Qt so I don’t need a bunch of examples compiled. If you want them, remove this option.
-nomake demos: Same as above.

There are other options you can use to turn off specific features. If you don’t envision ever using WebKit, you can turn it off (-no-webkit). Anyway, type the command above, and you’ll have to agree to the open source license. Once that’s done, it’ll take a few minutes to configure the build. When the configure is complete, type:

make -j 2

(Use whatever number of cores you have available instead of 2.)

Once that’s done (and it’ll be a while, trust me on this one!) you can install the libraries by typing:

sudo make install

The reason you need sudo is that Qt defaults to installing in /usr/local/Trolltech. You can change this with the -prefix option on the configure script, but I don’t mind Qt residing there. Once that’s done, you should have a functional Qt. Send the Qt libraries over to the chumby:

cd /usr/local/Trolltech/QtEmbedded-4.7.2-arm/
tar -cf - lib | ssh root@MyChumbysName.local tar -xf - -C /mnt/storage

And now Qt should be on the chumby’s SD card, with its libraries stored in /mnt/storage/lib. We can’t test it until we make a quick Qt app, though, so let’s install the Qt SDK.

Qt SDK

The Qt SDK is a massive download, currently in beta, but it works just fine. We’re going to install it and use it to create a test app to make sure Qt works. First, download the SDK from the Qt download site. I downloaded the Qt SDK 1.1 Beta online installer for Linux 64-bit from here (since my development machine is 64-bit). Make it executable and run it:

chmod +x Qt_SDK_Lin64_online_v1_1_beta_en.run
./Qt_SDK_Lin64_online_v1_1_beta_en.run

At this point, you’ll need to wait for it to finish “Retrieving information from remote installation sources…”. Once that’s done, accept the defaults, allow it to install in ~/QtSDK or wherever else you want it, and eventually you’ll have to wait for it to finish downloading and installing the SDK. Once it’s done, go ahead and allow it to Launch Qt Creator, but uncheck Open Qt SDK ReadMe.

After a few moments, Qt Creator will pop up. We now need to add our cross toolchain so that Qt Creator knows about it.

Adding the cross toolchain

Click on Tools, and choose Options. On the left, click Qt4. You should see a list of available Qt versions. We are going to add a new one, so click the blue + on the right side of the window. Where it says <specify a name>, type “Chumby Qt”, and where it says <specify a qmake location>, click Browse… and navigate to your cross compiled Qt’s qmake binary (/usr/local/Trolltech/QtEmbedded-4.7.2-arm/bin/qmake). Below, it should say “Found Qt version 4.7.2, using mkspec qws/linux-arm-chumby-gnueabi-g++ (Desktop)”. It’s inaccurate in claiming that this is a desktop target, but it’s ok — it still works. Click OK to save your new Qt version.

Creating a test app

I didn’t want to turn this into a tutorial on how to use Qt, but I at least want to walk you through creating a basic app, cross compiling it, getting it onto the chumby, and running it. Click on File and choose New File or Project…. Leave Qt Gui Application highlighted and click Choose…. Name your project TestChumbyQt and save it wherever you’d like. When it asks you to set up targets, uncheck all the targets except for the Chumby Qt target we created earlier. Click Next and leave the QMainWindow class name alone (MainWindow is fine for this example). Accept all the default choices.

Now we are ready to create the test app. Expand the Forms folder on the left and open mainwindow.ui. Click on the background of the newly-created window. Next, on the right side, there should be a “geometry” property. Click the + to expand it. Change width to 200 and height to 150. This should shrink the window a bit. On the left, drag a push button onto the main window. You can resize it to make it easier to press if you’d like.

Compiling the test app

Ok, we’re ready to compile the app. Go to Build and choose Build Project “TestChumbyQt”. If it asks you to save, click Save All. You should see a progress bar on the left side and eventually it will turn green, meaning the compile succeeded.

We’re ready to copy the app over to the chumby. I put my test app in ~/Documents, so your command may be slightly different from mine:

scp ~/Documents/TestChumbyQt-build-desktop/TestChumbyQt root@MyChumbysName.local:/mnt/storage

Environment variables

Now, we need to define a few environment variables in order for Qt to work. You will need to do all of the exports we did above when we were testing out tslib, so if you still have that shell open you can skip doing those again. I’m listing them again just in case, though, plus some extras you need to do regardless.

export LD_LIBRARY_PATH=/mnt/storage/tslib/lib:$LD_LIBRARY_PATH
export TSLIB_TSDEVICE=/dev/input/by-id/soc-noserial-event-ts
export TSLIB_CALIBFILE=/mnt/storage/tslib/etc/pointercal
export TSLIB_CONFFILE=/mnt/storage/tslib/etc/ts.conf
export TSLIB_PLUGINDIR=/mnt/storage/tslib/lib/ts

And the extras you need to do regardless:

export QWS_DISPLAY="LinuxFb:mmWidth80:mmHeight52"
export QT_QWS_FONTDIR=/mnt/storage/lib/fonts
export QWS_MOUSE_PROTO=tslib

The QWS_DISPLAY variable tells Qt about the physical dimensions of the chumby’s screen in millimeters. I did a very crude measurement with a ruler. If you don’t specify it, all the text will appear tiny because we’re working on a small screen. QT_QWS_FONTDIR tells Qt where to look for its fonts. Because Qt thinks its fonts are stored under /usr/local/Trolltech, we need to specify this so it knows where it should actually look. Finally, QWS_MOUSE_PROTO tells Qt to use tslib rather than the generic Linux input system.
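To get a feel for what the millimeter values mean, you can work out the dots per inch they imply. This is only a rough sketch of the arithmetic, not Qt’s actual code. It assumes the chumby one’s screen is 320x240 pixels, which is my assumption and is not stated above, and the dpi function name is made up.

```shell
# Dots per inch implied by a pixel count and a physical size in millimeters.
# dpi is a hypothetical helper; 25.4 is the number of millimeters in an inch.
dpi() {
    awk -v px="$1" -v mm="$2" 'BEGIN { printf "%.1f\n", px * 25.4 / mm }'
}

# 320x240 pixels is an assumed resolution; 80 and 52 are the values
# from QWS_DISPLAY above.
echo "horizontal: $(dpi 320 80) dpi"
echo "vertical:   $(dpi 240 52) dpi"
```

If the two numbers come out noticeably different from each other, that is a hint that the ruler measurement was as crude as I said it was.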

As I said earlier, you will want to make a file you can source that includes all of these environment variable definitions — they are necessary in order to start a Qt app. Notice I didn’t add /mnt/storage/lib (where we put the Qt libraries) to LD_LIBRARY_PATH — the reason I didn’t is because the chumby already has that directory in its pre-supplied LD_LIBRARY_PATH, so I didn’t want to put it in there twice.
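Here is one way to make that file. It is a sketch that writes all eight exports listed above into a file on the build machine, which you can then copy to the chumby. The write_chumby_env function and the chumby-qt-env.sh file name are made up for this example.

```shell
# Write the tslib and Qt environment variables into a file that can be sourced.
# write_chumby_env and the output file name are hypothetical.
write_chumby_env() {
    cat > "$1" <<'EOF'
# Source this file before starting a Qt app on the chumby.
export LD_LIBRARY_PATH=/mnt/storage/tslib/lib:$LD_LIBRARY_PATH
export TSLIB_TSDEVICE=/dev/input/by-id/soc-noserial-event-ts
export TSLIB_CALIBFILE=/mnt/storage/tslib/etc/pointercal
export TSLIB_CONFFILE=/mnt/storage/tslib/etc/ts.conf
export TSLIB_PLUGINDIR=/mnt/storage/tslib/lib/ts
export QWS_DISPLAY="LinuxFb:mmWidth80:mmHeight52"
export QT_QWS_FONTDIR=/mnt/storage/lib/fonts
export QWS_MOUSE_PROTO=tslib
EOF
}

write_chumby_env chumby-qt-env.sh
echo "Wrote chumby-qt-env.sh"
```

After copying it over with scp, you would load it in an SSH session on the chumby with a line like “. /mnt/storage/chumby-qt-env.sh” (the leading dot is the shell’s source command).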

Running the test app

We’re ready to go now. In an SSH session to the chumby, type:

/mnt/storage/TestChumbyQt -qws

(-qws tells TestChumbyQt to be a window server. If you run other apps while it’s running, you don’t want to pass -qws to them–only one app should be the window server.) You may see an error about being unable to open /etc/pointercal, but that’s okay–our environment variable tells it to look elsewhere anyway. If all goes well, your window should appear on the chumby’s screen. Try pressing the button and dragging the window around. Everything should work! You can quit the app by pressing control-C in the terminal, or tapping the X in the upper right corner of the window.
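If you get tired of setting things up by hand each time, a small wrapper can load the environment and start the app as the window server in one step. This is a sketch under a couple of assumptions: run_qws_app is a made-up name, and the first argument is a file containing the export lines from the Environment variables section, saved wherever you like.

```shell
# Source an environment file, then start an app as the QWS window server.
# run_qws_app is a hypothetical helper. It runs in a subshell so the
# variables do not leak into the calling shell.
run_qws_app() {
    env_file=$1
    app=$2
    shift 2
    (
        # Give the env file as a full path so the shell does not search PATH
        . "$env_file" || exit 1
        exec "$app" -qws "$@"
    )
}

# Only does anything when run on the chumby with the app already copied over.
# The env file path here is an example; use wherever you saved yours.
if [ -x /mnt/storage/TestChumbyQt ]; then
    run_qws_app /mnt/storage/chumby-qt-env.sh /mnt/storage/TestChumbyQt
fi
```

For a second app that should connect to an already running window server, you would start it the same way but without the -qws argument.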

Congratulations, you now have Qt working on the chumby. There are ways to get a Qt app to run at bootup instead of the Flash-based GUI, but that’s beyond the scope of this document. All that’s important is now you know how to run Qt apps.

All done

Unfortunately, I just don’t have the bandwidth necessary to host any binaries or source code for this. However, I have (hopefully) provided you with all the info you’ll need to create the libraries yourself. I have not yet figured out how to make a Windows-based toolchain for this — I know how to create the ARM cross compiler for Windows, but I just don’t know how to get Qt to compile inside of Windows. If I ever discover how, I’ll make another post about it. So for now, you will need a Linux environment for your development.

I know some of what I did was a little crude. I should have specified the ultimate destination directory for tslib when I configured it with --prefix, and I also probably could have specified in the configure script some of the paths that I had to redefine with environment variables, but this works for now. There’s nothing special about /mnt/storage, by the way; I just picked it because it was easy to put stuff there. It might be better suited for a special directory in /usr or something along those lines.

If you run into any problems, feel free to leave a comment and I’ll try to figure out what’s wrong. Enjoy!